1
Innocenti G, Obara M, Costa B, Jacobsen H, Katzmarzyk M, Cicin-Sain L, Kalinke U, Galardini M. Real-time identification of epistatic interactions in SARS-CoV-2 from large genome collections. Genome Biol 2024; 25:228. [PMID: 39175058 PMCID: PMC11342480 DOI: 10.1186/s13059-024-03355-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/02/2024] [Accepted: 07/26/2024] [Indexed: 08/24/2024] Open
Abstract
BACKGROUND: The emergence of the SARS-CoV-2 virus has highlighted the importance of genomic epidemiology in understanding the evolution of pathogens and guiding public health interventions. The Omicron variant in particular has underscored the role of epistasis in the evolution of lineages with both higher infectivity and immune escape, and therefore the necessity to update surveillance pipelines to detect them early on. RESULTS: In this study, we apply a method based on mutual information between positions in a multiple sequence alignment, which is capable of scaling up to millions of samples. We show how it can reliably predict known experimentally validated epistatic interactions, even when using as few as 10,000 sequences, which opens the possibility of making it a near real-time prediction system. We test this possibility by modifying the method to account for the sample collection date and apply it retrospectively to multiple sequence alignments for each month between March 2020 and March 2023. We detected a cornerstone epistatic interaction in the Spike protein between codons 498 and 501 as soon as seven samples with a double mutation were present in the dataset, thus demonstrating the method's sensitivity. We test the ability of the method to make inferences about emerging interactions by testing candidates predicted after March 2023, which we validate experimentally. CONCLUSIONS: We show how known epistatic interactions in SARS-CoV-2 can be detected with high sensitivity, and how emerging ones can be quickly prioritized for experimental validation, an approach that could be implemented downstream of pandemic genome sequencing efforts.
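The core idea in the abstract above, scoring pairs of alignment positions by mutual information, can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names `mutual_information` and `rank_column_pairs` are hypothetical, the alignment is toy data, and the published method also accounts for sample collection dates and scales to millions of sequences, which this version does not attempt.

```python
from collections import Counter
from itertools import combinations
from math import log2


def mutual_information(col_a, col_b):
    """Mutual information (in bits) between two alignment columns."""
    n = len(col_a)
    count_a = Counter(col_a)
    count_b = Counter(col_b)
    count_ab = Counter(zip(col_a, col_b))
    mi = 0.0
    for (a, b), c in count_ab.items():
        # p(a,b) * log2(p(a,b) / (p(a) * p(b))), written with raw counts
        mi += (c / n) * log2(c * n / (count_a[a] * count_b[b]))
    return mi


def rank_column_pairs(alignment):
    """Score every pair of columns and sort them by decreasing MI."""
    columns = list(zip(*alignment))  # transpose sequences into columns
    scores = [
        (i, j, mutual_information(columns[i], columns[j]))
        for i, j in combinations(range(len(columns)), 2)
    ]
    return sorted(scores, key=lambda s: s[2], reverse=True)


# Toy alignment (hypothetical data): columns 0 and 1 co-vary, column 2 does not.
toy_alignment = ["ACT", "ACG", "GTT", "GTG"]
top = rank_column_pairs(toy_alignment)[0]
```

In this toy case the top-ranked pair is columns 0 and 1, which always change together; a real pipeline would need to handle gaps, ambiguous bases and phylogenetic structure before interpreting high scores as epistasis.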
Affiliation(s)
- Gabriel Innocenti
- Institute for Molecular Bacteriology, TWINCORE Centre for Experimental and Clinical Infection Research, a joint venture between the Hannover Medical School (MHH) and the Helmholtz Centre for Infection Research (HZI), Hannover, Germany
- Cluster of Excellence RESIST (EXC 2155), Hannover Medical School (MHH), Hannover, Germany
- Center for Cancer Research, Medical University of Vienna, Vienna, Austria
- Maureen Obara
- Institute for Experimental Infection Research, TWINCORE Centre for Experimental and Clinical Infection Research, a joint venture between the Hannover Medical School (MHH) and the Helmholtz Centre for Infection Research (HZI), Hannover, Germany
- Bibiana Costa
- Institute for Experimental Infection Research, TWINCORE Centre for Experimental and Clinical Infection Research, a joint venture between the Hannover Medical School (MHH) and the Helmholtz Centre for Infection Research (HZI), Hannover, Germany
- Henning Jacobsen
- Helmholtz Centre for Infection Research, Department of Viral Immunology (VIRI), Brunswick, Germany
- Centre for Individualized Infection Medicine (CiiM) a Joint Venture of Helmholtz Centre for Infection Research and Hannover Medical School, Hannover, Germany
- Maeva Katzmarzyk
- Helmholtz Centre for Infection Research, Department of Viral Immunology (VIRI), Brunswick, Germany
- Centre for Individualized Infection Medicine (CiiM) a Joint Venture of Helmholtz Centre for Infection Research and Hannover Medical School, Hannover, Germany
- Luka Cicin-Sain
- Helmholtz Centre for Infection Research, Department of Viral Immunology (VIRI), Brunswick, Germany
- Centre for Individualized Infection Medicine (CiiM) a Joint Venture of Helmholtz Centre for Infection Research and Hannover Medical School, Hannover, Germany
- Ulrich Kalinke
- Cluster of Excellence RESIST (EXC 2155), Hannover Medical School (MHH), Hannover, Germany
- Institute for Experimental Infection Research, TWINCORE Centre for Experimental and Clinical Infection Research, a joint venture between the Hannover Medical School (MHH) and the Helmholtz Centre for Infection Research (HZI), Hannover, Germany
- Marco Galardini
- Institute for Molecular Bacteriology, TWINCORE Centre for Experimental and Clinical Infection Research, a joint venture between the Hannover Medical School (MHH) and the Helmholtz Centre for Infection Research (HZI), Hannover, Germany.
- Cluster of Excellence RESIST (EXC 2155), Hannover Medical School (MHH), Hannover, Germany.
2
Khurana MP, Curran-Sebastian J, Scheidwasser N, Morgenstern C, Rasmussen M, Fonager J, Stegger M, Tang MHE, Juul JL, Escobar-Herrera LA, Møller FT, Albertsen M, Kraemer MUG, du Plessis L, Jokelainen P, Lehmann S, Krause TG, Ullum H, Duchêne DA, Mortensen LH, Bhatt S. High-resolution epidemiological landscape from ~290,000 SARS-CoV-2 genomes from Denmark. Nat Commun 2024; 15:7123. [PMID: 39164246 PMCID: PMC11335946 DOI: 10.1038/s41467-024-51371-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/07/2024] [Accepted: 08/01/2024] [Indexed: 08/22/2024] Open
Abstract
Vast amounts of pathogen genomic, demographic and spatial data are transforming our understanding of SARS-CoV-2 emergence and spread. We examined the drivers of molecular evolution and spread of 291,791 SARS-CoV-2 genomes from Denmark in 2021. With a sequencing rate consistently exceeding 60%, and up to 80% of PCR-positive samples between March and November, the viral genome set is broadly whole-epidemic representative. We identify a consistent rise in viral diversity over time, with notable spikes upon the importation of novel variants (e.g., Delta and Omicron). By linking genomic data with rich individual-level demographic data from national registers, we find that individuals aged < 15 and > 75 years had a lower contribution to molecular change (i.e., branch lengths) compared to other age groups, but similar molecular evolutionary rates, suggesting a lower likelihood of introducing novel variants. Similarly, we find greater molecular change among vaccinated individuals, suggestive of immune evasion. We also observe evidence that transmission in rural areas follows predictable diffusion processes. Conversely, transmission in urban areas is, as expected, more complex due to high mobility, emphasising the role of population structure in driving virus spread. Our analyses highlight the added value of integrating genomic data with detailed demographic and spatial information, particularly in the absence of structured infection surveys.
Affiliation(s)
- Mark P Khurana
- Section of Epidemiology, Department of Public Health, University of Copenhagen, Copenhagen, Denmark.
- Jacob Curran-Sebastian
- Section of Epidemiology, Department of Public Health, University of Copenhagen, Copenhagen, Denmark
- Neil Scheidwasser
- Section of Epidemiology, Department of Public Health, University of Copenhagen, Copenhagen, Denmark
- Christian Morgenstern
- MRC Centre for Global Infectious Disease Analysis, Department of Infectious Disease Epidemiology, School of Public Health, Faculty of Medicine, Imperial College London, London, UK
- Morten Rasmussen
- Virus Research and Development Laboratory, Statens Serum Institut, Copenhagen, Denmark
- Jannik Fonager
- Virus Research and Development Laboratory, Statens Serum Institut, Copenhagen, Denmark
- Marc Stegger
- Department of Bacteria, Parasites and Fungi, Statens Serum Institut, Copenhagen, Denmark
- Antimicrobial Resistance and Infectious Diseases Laboratory, Harry Butler Institute, Murdoch University, Murdoch, WA, Australia
- Man-Hung Eric Tang
- Department of Bacteria, Parasites and Fungi, Statens Serum Institut, Copenhagen, Denmark
- Jonas L Juul
- Department of Applied Mathematics and Computer Science, Technical University of Denmark, Kongens Lyngby, Denmark
- Mads Albertsen
- Department of Chemistry and Bioscience, Aalborg University, Aalborg, Denmark
- Louis du Plessis
- Department of Biosystems Science and Engineering, ETH Zurich, Basel, Switzerland
- Pikka Jokelainen
- Infectious Disease Preparedness, Statens Serum Institut, Copenhagen, Denmark
- Sune Lehmann
- Department of Applied Mathematics and Computer Science, Technical University of Denmark, Kongens Lyngby, Denmark
- Tyra G Krause
- Epidemiological Infectious Disease Preparedness, Statens Serum Institut Copenhagen, Copenhagen, Denmark
- David A Duchêne
- Section of Epidemiology, Department of Public Health, University of Copenhagen, Copenhagen, Denmark
- Laust H Mortensen
- Section of Epidemiology, Department of Public Health, University of Copenhagen, Copenhagen, Denmark
- Statistics Denmark, Copenhagen, Denmark
- Samir Bhatt
- Section of Epidemiology, Department of Public Health, University of Copenhagen, Copenhagen, Denmark
- MRC Centre for Global Infectious Disease Analysis, Department of Infectious Disease Epidemiology, School of Public Health, Faculty of Medicine, Imperial College London, London, UK
3
Casimiro-Soriguer CS, Pérez-Florido J, Robles EA, Lara M, Aguado A, Rodríguez Iglesias MA, Lepe JA, García F, Pérez-Alegre M, Andújar E, Jiménez VE, Camino LP, Loruso N, Ameyugo U, Vazquez IM, Lozano CM, Chaves JA, Dopazo J. The integrated genomic surveillance system of Andalusia (SIEGA) provides a One Health regional resource connected with the clinic. Sci Rep 2024; 14:19200. [PMID: 39160186 PMCID: PMC11333592 DOI: 10.1038/s41598-024-70107-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/30/2024] [Accepted: 08/13/2024] [Indexed: 08/21/2024] Open
Abstract
The One Health approach, recognizing the interconnectedness of human, animal, and environmental health, has gained significance amid emerging zoonotic diseases and antibiotic resistance concerns. This paper aims to demonstrate the utility of a collaborative tool, the SIEGA, for monitoring infectious diseases across domains, fostering a comprehensive understanding of disease dynamics and risk factors, highlighting the pivotal role of One Health surveillance systems. Raw whole-genome sequencing data are processed through different species-specific open software that additionally reports the presence of genes associated with antimicrobial resistance and virulence. The SIEGA application is a Laboratory Information Management System that allows customizing reports, detecting transmission chains, and promptly alerting on alarming genetic similarities. The SIEGA initiative has successfully accumulated a comprehensive collection of more than 1900 bacterial genomes, including Salmonella enterica, Listeria monocytogenes, Campylobacter jejuni, Escherichia coli, Yersinia enterocolitica and Legionella pneumophila, showcasing its potential in monitoring pathogen transmission, resistance patterns, and virulence factors. SIEGA enables customizable reports and prompt detection of transmission chains, highlighting its contribution to enhancing vigilance and response capabilities. Here we show the potential of genomics in One Health surveillance when supported by an appropriate bioinformatic tool. By facilitating precise disease control strategies and antimicrobial resistance management, SIEGA enhances global health security and reduces the burden of infectious diseases. The integration of health data from humans, animals, and the environment, coupled with advanced genomics, underscores the importance of a holistic One Health approach in mitigating health threats.
Affiliation(s)
- Carlos S Casimiro-Soriguer
- Andalusian Platform for Computational Medicine, Andalusian Public Foundation Progress and Health-FPS, Seville, Spain
- Institute of Biomedicine of Seville, IBiS, University Hospital Virgen del Rocío/CSIC/University of Seville, 41013, Seville, Spain
- Javier Pérez-Florido
- Andalusian Platform for Computational Medicine, Andalusian Public Foundation Progress and Health-FPS, Seville, Spain
- Institute of Biomedicine of Seville, IBiS, University Hospital Virgen del Rocío/CSIC/University of Seville, 41013, Seville, Spain
- Enrique A Robles
- Andalusian Platform for Computational Medicine, Andalusian Public Foundation Progress and Health-FPS, Seville, Spain
- María Lara
- Andalusian Platform for Computational Medicine, Andalusian Public Foundation Progress and Health-FPS, Seville, Spain
- Andrea Aguado
- Andalusian Platform for Computational Medicine, Andalusian Public Foundation Progress and Health-FPS, Seville, Spain
- José A Lepe
- Institute of Biomedicine of Seville, IBiS, University Hospital Virgen del Rocío/CSIC/University of Seville, 41013, Seville, Spain
- Servicio de Microbiología, Unidad Clínica Enfermedades Infecciosas, Microbiología y Medicina Preventiva, Hospital Universitario Virgen del Rocío, 41013, Sevilla, Spain
- Centro de Investigación Biomédica en Red en Enfermedades Infecciosas (CIBERINFEC), ISCIII, Madrid, Spain
- Federico García
- Centro de Investigación Biomédica en Red en Enfermedades Infecciosas (CIBERINFEC), ISCIII, Madrid, Spain
- Servicio de Microbiología. Hospital Universitario San Cecilio, 18016, Granada, Spain
- Instituto de Investigación Biosanitaria, Ibs.GRANADA, 18012, Granada, Spain
- Mónica Pérez-Alegre
- Genomic Unit, Andalusian Molecular Biology and Regenerative Medicine Center (CABIMER), CSIC University of Seville University Pablo de Olavide, Seville, Spain
- Eloísa Andújar
- Genomic Unit, Andalusian Molecular Biology and Regenerative Medicine Center (CABIMER), CSIC University of Seville University Pablo de Olavide, Seville, Spain
- Victoria E Jiménez
- Genomic Unit, Andalusian Molecular Biology and Regenerative Medicine Center (CABIMER), CSIC University of Seville University Pablo de Olavide, Seville, Spain
- Lola P Camino
- Genomic Unit, Andalusian Molecular Biology and Regenerative Medicine Center (CABIMER), CSIC University of Seville University Pablo de Olavide, Seville, Spain
- Nicola Loruso
- Dirección General de Salud Pública y Ordenación Farmacéutica, Consejería de Salud y Consumo- Junta de Andalucía, Seville, Spain
- Ulises Ameyugo
- Dirección General de Salud Pública y Ordenación Farmacéutica, Consejería de Salud y Consumo- Junta de Andalucía, Seville, Spain
- Isabel María Vazquez
- Dirección General de Salud Pública y Ordenación Farmacéutica, Consejería de Salud y Consumo- Junta de Andalucía, Seville, Spain
- Carlota M Lozano
- Dirección General de Salud Pública y Ordenación Farmacéutica, Consejería de Salud y Consumo- Junta de Andalucía, Seville, Spain
- J Alberto Chaves
- Dirección General de Salud Pública y Ordenación Farmacéutica, Consejería de Salud y Consumo- Junta de Andalucía, Seville, Spain
- Joaquin Dopazo
- Andalusian Platform for Computational Medicine, Andalusian Public Foundation Progress and Health-FPS, Seville, Spain.
- Institute of Biomedicine of Seville, IBiS, University Hospital Virgen del Rocío/CSIC/University of Seville, 41013, Seville, Spain.
4
Poon AFY. Prospects for a sequence-based taxonomy of influenza A virus subtypes. Virus Evol 2024; 10:veae064. [PMID: 39247559 PMCID: PMC11378807 DOI: 10.1093/ve/veae064] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/24/2023] [Revised: 05/03/2024] [Accepted: 08/09/2024] [Indexed: 09/10/2024] Open
Abstract
Hemagglutinin (HA) and neuraminidase (NA) proteins are the primary antigenic targets of influenza A virus (IAV) infections. IAV infections are generally classified into subtypes of HA and NA proteins, e.g. H3N2. Most of the known subtypes were originally defined by a lack of antibody cross-reactivity. However, genetic sequencing has played an increasingly important role in characterizing the evolving diversity of IAV. Novel subtypes have recently been described solely by their genetic sequences, and IAV infections are routinely subtyped by molecular assays, or the comparison of sequences to references. In this study, I carry out a comparative analysis of all available IAV protein sequences in the Genbank database (over 1.1 million, reduced to 272,292 unique sequences prior to phylogenetic reconstruction) to determine whether the serologically defined subtypes can be reproduced with sequence-based criteria. I show that a robust genetic taxonomy of HA and NA subtypes can be obtained using a simple clustering method, namely, by progressively partitioning the phylogeny on its longest internal branches. However, this taxonomy also requires some amendments to the current nomenclature. For example, two IAV isolates from bats previously characterized as a divergent lineage of H9N2 should be separated into their own subtype. With the exception of these small and highly divergent lineages, the phylogenies relating each of the other six genomic segments do not support partitions into major subtypes.
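The clustering idea named in this abstract, partitioning a phylogeny on its longest internal branches, can be sketched in a few lines of Python. This is an illustrative assumption rather than the author's code: the tree representation (a `parent` dictionary plus a `length` dictionary), the function name `partition_by_longest_branches`, and the fixed target number of clusters are all hypothetical, and the abstract does not state what stopping rule the study used.

```python
def partition_by_longest_branches(parent, length, n_clusters):
    """Split the tips of a rooted tree by cutting its longest internal branches.

    parent: maps each node to its parent (the root maps to None).
    length: maps each non-root node to the length of the branch above it.
    """
    children = {}
    for node, par in parent.items():
        if par is not None:
            children.setdefault(par, []).append(node)

    # Internal branches lead to nodes that have children and are not the root.
    internal = [n for n in parent if n in children and parent[n] is not None]
    # Cutting k - 1 branches yields k groups of tips.
    cuts = set(sorted(internal, key=lambda n: length[n], reverse=True)[: n_clusters - 1])

    clusters = {}
    tips = [n for n in parent if n not in children]
    for tip in tips:
        node = tip
        # Walk rootward until reaching the node below a cut branch, or the root.
        while node not in cuts and parent[node] is not None:
            node = parent[node]
        clusters.setdefault(node, []).append(tip)
    return sorted(sorted(c) for c in clusters.values())


# Toy tree (hypothetical): ((A,B)X:5,(C,D)Y:2,E)R
parent = {"R": None, "X": "R", "Y": "R", "A": "X", "B": "X", "C": "Y", "D": "Y", "E": "R"}
length = {"X": 5.0, "Y": 2.0, "A": 1.0, "B": 1.0, "C": 1.0, "D": 1.0, "E": 1.0}
two_groups = partition_by_longest_branches(parent, length, 2)
three_groups = partition_by_longest_branches(parent, length, 3)
```

With two clusters the long branch above X separates tips A and B from the rest; asking for three also cuts the branch above Y. Only internal branches are considered, so a single divergent tip is never split off on its own in this sketch.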
Affiliation(s)
- Art F Y Poon
- Department of Pathology & Laboratory Medicine, Western University, Dental Sciences Building, Rm. 4044, London, Ontario N6A 5C1, Canada
- Department of Microbiology & Immunology, Western University, 1151 Richmond Street, London, Ontario N6A 3K7, Canada
- Department of Computer Science, Western University, Room 355, Middlesex College, London N6A 5B7, Canada
5
Chaguza C, Chibwe I, Chaima D, Musicha P, Ndeketa L, Kasambara W, Mhango C, Mseka UL, Bitilinyu-Bangoh J, Mvula B, Kipandula W, Bonongwe P, Munthali RJ, Ngwira S, Mwendera CA, Kalizang'oma A, Jambo KC, Kambalame D, Kamng'ona AW, Steele AD, Chauma-Mwale A, Hungerford D, Kagoli M, Nyaga MM, Dube Q, French N, Msefula CL, Cunliffe NA, Jere KC. Genomic insights into the 2022-2023 Vibrio cholerae outbreak in Malawi. Nat Commun 2024; 15:6291. [PMID: 39060226 PMCID: PMC11282309 DOI: 10.1038/s41467-024-50484-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/08/2023] [Accepted: 07/09/2024] [Indexed: 07/28/2024] Open
Abstract
Malawi experienced its deadliest Vibrio cholerae (Vc) outbreak following devastating cyclones, with >58,000 cases and >1700 deaths reported between March 2022 and May 2023. Here, we use population genomics to investigate the attributes and origin of the Malawi 2022-2023 Vc outbreak isolates. Our results demonstrate the predominance of the ST69 clone, also known as the seventh cholera pandemic El Tor (7PET) lineage, expressing the O1 Ogawa serotype (~80%), followed by Inaba (~16%) and sporadic non-O1/non-7PET serogroups (~4%). Phylogenetic reconstruction revealed that the Malawi outbreak strains correspond to a recent importation from Asia into Africa (sublineage AFR15). These isolates harboured known antimicrobial resistance and virulence elements, notably the ICEGEN/ICEVchHai1/ICEVchind5 SXT/R391-like integrative conjugative elements and a CTXφ prophage with the ctxB7 genotype compared to historical Malawian Vc isolates. These data suggest that the devastating cyclones, coupled with the recent importation of 7PET serogroup O1 strains, may explain the magnitude of the 2022-2023 cholera outbreak in Malawi.
Affiliation(s)
- Chrispin Chaguza
- Department of Epidemiology of Microbial Diseases, Yale School of Public Health, Yale University, New Haven, CT, USA.
- Yale Institute for Global Health, Yale University, New Haven, CT, USA.
- Department of Clinical Infection, Microbiology and Immunology, Institute of Infection, Veterinary and Ecological Sciences, University of Liverpool, Liverpool, UK.
- NIHR Mucosal Pathogens Research Unit, Research Department of Infection, Division of Infection and Immunity, University College London, London, UK.
- Parasites and Microbes Programme, Wellcome Sanger Institute, Hinxton, UK.
- Innocent Chibwe
- Public Health Institute of Malawi, Ministry of Health, Lilongwe, Malawi
- David Chaima
- Department of Pathology, School of Medicine and Oral Health, Kamuzu University of Health Sciences, Blantyre, Malawi
- Patrick Musicha
- Parasites and Microbes Programme, Wellcome Sanger Institute, Hinxton, UK
- Malawi-Liverpool-Wellcome Research Programme, Blantyre, Malawi
- Latif Ndeketa
- Department of Clinical Infection, Microbiology and Immunology, Institute of Infection, Veterinary and Ecological Sciences, University of Liverpool, Liverpool, UK
- Malawi-Liverpool-Wellcome Research Programme, Blantyre, Malawi
- Upendo L Mseka
- Malawi-Liverpool-Wellcome Research Programme, Blantyre, Malawi
- Bernard Mvula
- Public Health Institute of Malawi, Ministry of Health, Lilongwe, Malawi
- Wakisa Kipandula
- Department of Medical Laboratory Sciences, Faculty of Biomedical Sciences and Health profession, Kamuzu University of Health Sciences, Blantyre, Malawi
- Patrick Bonongwe
- Ministry of Health, Balaka District Hospital, Balaka, Machinga, Malawi
- Richard J Munthali
- Department of Psychiatry, University of British Columbia, Vancouver, BC, Canada
- Selemani Ngwira
- Public Health Institute of Malawi, Ministry of Health, Lilongwe, Malawi
- Chikondi A Mwendera
- Department of Clinical Infection, Microbiology and Immunology, Institute of Infection, Veterinary and Ecological Sciences, University of Liverpool, Liverpool, UK
- Akuzike Kalizang'oma
- NIHR Mucosal Pathogens Research Unit, Research Department of Infection, Division of Infection and Immunity, University College London, London, UK
- Malawi-Liverpool-Wellcome Research Programme, Blantyre, Malawi
- Kondwani C Jambo
- Malawi-Liverpool-Wellcome Research Programme, Blantyre, Malawi
- Department of Clinical Sciences, Liverpool School of Tropical Medicine, Liverpool, UK
- Arox W Kamng'ona
- Department of Biomedical Sciences, School of Life Sciences and Allied Health Professions, Kamuzu University of Health Sciences, Blantyre, Malawi
- A Duncan Steele
- Diarrhoeal Pathogens Research Unit, Sefako Makgatho Health Sciences University, Medunsa, 0204, Pretoria, South Africa
- Daniel Hungerford
- Department of Clinical Infection, Microbiology and Immunology, Institute of Infection, Veterinary and Ecological Sciences, University of Liverpool, Liverpool, UK
- NIHR Health Protection Research Unit in Gastrointestinal Infections, University of Liverpool, Liverpool, UK
- Matthew Kagoli
- Public Health Institute of Malawi, Ministry of Health, Lilongwe, Malawi
- Martin M Nyaga
- Next Generation Sequencing Unit and Division of Virology, Faculty of Health Sciences, University of the Free State, Bloemfontein, 9300, South Africa
- Queen Dube
- Malawi Ministry of Health, Lilongwe, Malawi
- Neil French
- Department of Clinical Infection, Microbiology and Immunology, Institute of Infection, Veterinary and Ecological Sciences, University of Liverpool, Liverpool, UK
- Malawi-Liverpool-Wellcome Research Programme, Blantyre, Malawi
- Chisomo L Msefula
- Department of Pathology, School of Medicine and Oral Health, Kamuzu University of Health Sciences, Blantyre, Malawi
- Nigel A Cunliffe
- Department of Clinical Infection, Microbiology and Immunology, Institute of Infection, Veterinary and Ecological Sciences, University of Liverpool, Liverpool, UK
- NIHR Health Protection Research Unit in Gastrointestinal Infections, University of Liverpool, Liverpool, UK
- NIHR Global Health Research Group on Gastrointestinal Infections, University of Liverpool, Liverpool, UK
- Khuzwayo C Jere
- Department of Clinical Infection, Microbiology and Immunology, Institute of Infection, Veterinary and Ecological Sciences, University of Liverpool, Liverpool, UK.
- Malawi-Liverpool-Wellcome Research Programme, Blantyre, Malawi.
- Department of Medical Laboratory Sciences, Faculty of Biomedical Sciences and Health profession, Kamuzu University of Health Sciences, Blantyre, Malawi.
- NIHR Health Protection Research Unit in Gastrointestinal Infections, University of Liverpool, Liverpool, UK.
- NIHR Global Health Research Group on Gastrointestinal Infections, University of Liverpool, Liverpool, UK.
6
Ma W, Fu H, Jian F, Cao Y, Li M. Distinct SARS-CoV-2 populational immune backgrounds tolerate divergent RBD evolutionary preferences. Natl Sci Rev 2024; 11:nwae196. [PMID: 39071101 PMCID: PMC11275455 DOI: 10.1093/nsr/nwae196] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/20/2023] [Revised: 06/02/2024] [Accepted: 06/03/2024] [Indexed: 07/30/2024] Open
Abstract
Immune evasion is a pivotal force shaping the evolution of viruses. Nonetheless, the extent to which virus evolution varies among populations with diverse immune backgrounds remains an unsolved mystery. Prior to the widespread SARS-CoV-2 infections in December 2022 and January 2023, the Chinese population possessed a markedly distinct (less potent) immune background due to its low infection rate, compared to countries experiencing multiple infection waves, presenting an unprecedented opportunity to investigate how the virus has evolved under different immune contexts. We compared the mutation spectrum and functional potential of the newly derived mutations that occurred in BA.5.2.48, BF.7.14 and BA.5.2.49 (variants prevalent in China) with their counterparts in other countries. We found that the emerging mutations in the receptor-binding-domain region in these lineages were more widely dispersed and evenly distributed across different epitopes. These mutations led to a higher angiotensin-converting enzyme 2 (ACE2) binding affinity and reduced potential for immune evasion compared to their counterparts in other countries. These findings suggest a milder immune pressure and less evident immune imprinting within the Chinese population. Despite the emergence of numerous immune-evading variants in China, none of them outcompeted the original strain until the arrival of the XBB variant, which had stronger immune evasion and subsequently outcompeted all circulating variants. Our findings demonstrated that the continuously changing immune background led to varying evolutionary pressures on SARS-CoV-2. Thus, in addition to viral genome surveillance, immune background surveillance is also imperative for predicting forthcoming mutations and understanding how these variants spread in the population.
Affiliation(s)
- Wentai Ma
- Beijing Institute of Genomics, Chinese Academy of Sciences, and China National Center for Bioinformation, Beijing 100101, China
- University of Chinese Academy of Sciences, Beijing 100049, China
- Haoyi Fu
- Beijing Institute of Genomics, Chinese Academy of Sciences, and China National Center for Bioinformation, Beijing 100101, China
- University of Chinese Academy of Sciences, Beijing 100049, China
- Fanchong Jian
- Biomedical Pioneering Innovation Center (BIOPIC), Peking University, Beijing 100871, China
- Yunlong Cao
- Biomedical Pioneering Innovation Center (BIOPIC), Peking University, Beijing 100871, China
- Changping Laboratory, Beijing 102206, China
- Mingkun Li
- Beijing Institute of Genomics, Chinese Academy of Sciences, and China National Center for Bioinformation, Beijing 100101, China
- University of Chinese Academy of Sciences, Beijing 100049, China
7
Featherstone LA, Wirth W. PhyloJS: Bridging phylogenetics and web development with a JavaScript utility library. Ecol Evol 2024; 14:e11603. [PMID: 38932954 PMCID: PMC11199911 DOI: 10.1002/ece3.11603] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/24/2023] [Revised: 05/24/2024] [Accepted: 06/05/2024] [Indexed: 06/28/2024] Open
Abstract
There is an increasing number of libraries devoted to parsing, manipulating and visualising phylogenetic trees in JavaScript. Many of these libraries bundle tree manipulation with visualisation, but have limited ability to manipulate trees and lack detailed documentation. As the number of web-based phylogenetic tools and the size of phylogenetics datasets increases, there is a need for a library that parses, writes and manipulates phylogenetic trees that is interoperable with other phylogenetic and data visualisation libraries. Here we introduce PhyloJS, a light zero-dependency TypeScript and JavaScript library for reading, writing and manipulating phylogenetic trees. PhyloJS allows for modification of and data-extraction from trees to integrate with other phylogenetics and data visualisation libraries. It can swiftly handle large trees, up to at least 10^6 tips in size, making it ideal for developing the next generation of more complex web-based phylogenetics applications handling ever larger datasets. The PhyloJS source code is available on GitHub (https://github.com/clockor2/phylojs) and can be installed via npm with the command npm install phylojs. Extensive documentation is available at https://clockor2.github.io/phylojs/.
Affiliation(s)
- Leo A. Featherstone
- Peter Doherty Institute for Infection and Immunity, University of Melbourne, Melbourne, Victoria, Australia
- Wytamma Wirth
- Peter Doherty Institute for Infection and Immunity, University of Melbourne, Melbourne, Victoria, Australia
8
Hunt M, Hinrichs AS, Anderson D, Karim L, Dearlove BL, Knaggs J, Constantinides B, Fowler PW, Rodger G, Street T, Lumley S, Webster H, Sanderson T, Ruis C, de Maio N, Amenga-Etego LN, Amuzu DSY, Avaro M, Awandare GA, Ayivor-Djanie R, Bashton M, Batty EM, Bediako Y, De Belder D, Benedetti E, Bergthaler A, Boers SA, Campos J, Carr RAA, Cuba F, Dattero ME, Dejnirattisai W, Dilthey A, Duedu KO, Endler L, Engelmann I, Francisco NM, Fuchs J, Gnimpieba EZ, Groc S, Gyamfi J, Heemskerk D, Houwaart T, Hsiao NY, Huska M, Hölzer M, Iranzadeh A, Jarva H, Jeewandara C, Jolly B, Joseph R, Kant R, Ki KKK, Kurkela S, Lappalainen M, Lataretu M, Liu C, Malavige GN, Mashe T, Mongkolsapaya J, Montes B, Molina Mora JA, Morang'a CM, Mvula B, Nagarajan N, Nelson A, Ngoi JM, da Paixão JP, Panning M, Poklepovich T, Quashie PK, Ranasinghe D, Russo M, San JE, Sanderson ND, Scaria V, Screaton G, Sironen T, Sisay A, Smith D, Smura T, Supasa P, Suphavilai C, Swann J, Tegally H, Tegomoh B, Vapalahti O, Walker A, Wilkinson RJ, Williamson C, de Oliveira T, Peto TE, Crook D, Corbett-Detig R, Iqbal Z. Addressing pandemic-wide systematic errors in the SARS-CoV-2 phylogeny. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2024:2024.04.29.591666. [PMID: 38746185 PMCID: PMC11092452 DOI: 10.1101/2024.04.29.591666] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/16/2024]
Abstract
The SARS-CoV-2 genome occupies a unique place in infection biology: it is the most highly sequenced genome on earth (making up over 20% of public sequencing datasets) with fine-scale information on sampling date and geography, and has been subject to unprecedentedly intense analysis. As a result, these phylogenetic data are an incredibly valuable resource for science and public health. However, the vast majority of the data was sequenced by tiling amplicons across the full genome, with amplicon schemes that changed over the pandemic as mutations in the viral genome interacted with primer binding sites. In combination with the disparate set of genome assembly workflows and lack of consistent quality control (QC) processes, the current genomes have many systematic errors that have evolved with the virus and amplicon schemes. These errors have significant impacts on the phylogeny, and therefore over the last few years, many thousands of hours of researchers' time have been spent in "eyeballing" trees, looking for artefacts, and then patching the tree. Given the huge value of this dataset, we therefore set out to reprocess the complete set of public raw sequence data in a rigorous amplicon-aware manner, and build a cleaner phylogeny. Here we provide a global tree of 3,960,704 samples, built from a consistently assembled set of high quality consensus sequences from all available public data as of March 2023, viewable at https://viridian.taxonium.org. Each genome was constructed using a novel assembly tool called Viridian (https://github.com/iqbal-lab-org/viridian), developed specifically to process amplicon sequence data, eliminating artefactual errors and masking the genome at low quality positions. We provide simulation and empirical validation of the methodology, and quantify the improvement in the phylogeny. Phase 2 of our project will address the fact that the data in the public archives is heavily geographically biased towards the Global North. We therefore have contributed new raw data to ENA/SRA from many countries including Ghana, Thailand, Laos, Sri Lanka, India, Argentina and Singapore. We will incorporate these, along with all public raw data submitted between March 2023 and the current day, into an updated set of assemblies and phylogeny. We hope the tree, consensus sequences and Viridian will be a valuable resource for researchers.
Affiliation(s)
- Martin Hunt
- European Molecular Biology Laboratory - European Bioinformatics Institute, Hinxton, UK
- Nuffield Department of Medicine, University of Oxford, Oxford, UK
- National Institute of Health Research Oxford Biomedical Research Centre, John Radcliffe Hospital, Headley Way, Oxford, UK
- Health Protection Research Unit in Healthcare Associated Infections and Antimicrobial Resistance, University of Oxford, Oxford, UK
| | - Angie S Hinrichs
- Genomics Institute, University of California, Santa Cruz, Santa Cruz, CA
| | - Daniel Anderson
- European Molecular Biology Laboratory - European Bioinformatics Institute, Hinxton, UK
| | - Lily Karim
- Genomics Institute, University of California, Santa Cruz, Santa Cruz, CA
- Department of Biomolecular Engineering, University of California, Santa Cruz, Santa Cruz, CA
| | - Bethany L Dearlove
- Institute for Hygiene and Applied Immunology, Center for Pathophysiology, Infectiology and Immunology, Medical University of Vienna, Vienna 1090, Austria
| | - Jeff Knaggs
- European Molecular Biology Laboratory - European Bioinformatics Institute, Hinxton, UK
- Nuffield Department of Medicine, University of Oxford, Oxford, UK
- National Institute of Health Research Oxford Biomedical Research Centre, John Radcliffe Hospital, Headley Way, Oxford, UK
- Health Protection Research Unit in Healthcare Associated Infections and Antimicrobial Resistance, University of Oxford, Oxford, UK
| | - Bede Constantinides
- Nuffield Department of Medicine, University of Oxford, Oxford, UK
- Health Protection Research Unit in Healthcare Associated Infections and Antimicrobial Resistance, University of Oxford, Oxford, UK
| | - Philip W Fowler
- Nuffield Department of Medicine, University of Oxford, Oxford, UK
- National Institute of Health Research Oxford Biomedical Research Centre, John Radcliffe Hospital, Headley Way, Oxford, UK
- Health Protection Research Unit in Healthcare Associated Infections and Antimicrobial Resistance, University of Oxford, Oxford, UK
| | - Gillian Rodger
- Nuffield Department of Medicine, University of Oxford, Oxford, UK
- Health Protection Research Unit in Healthcare Associated Infections and Antimicrobial Resistance, University of Oxford, Oxford, UK
| | - Teresa Street
- Nuffield Department of Medicine, University of Oxford, Oxford, UK
- National Institute of Health Research Oxford Biomedical Research Centre, John Radcliffe Hospital, Headley Way, Oxford, UK
| | - Sheila Lumley
- Nuffield Department of Medicine, University of Oxford, Oxford, UK
- Department of Infectious Diseases and Microbiology, John Radcliffe Hospital, Oxford, UK
| | - Hermione Webster
- Nuffield Department of Medicine, University of Oxford, Oxford, UK
| | | | - Christopher Ruis
- Victor Phillip Dahdaleh Heart & Lung Research Institute, University of Cambridge, Cambridge, UK
- Department of Veterinary Medicine, University of Cambridge, Cambridge, UK
| | - Nicola de Maio
- European Molecular Biology Laboratory - European Bioinformatics Institute, Hinxton, UK
| | - Lucas N Amenga-Etego
- West African Centre for Cell Biology of Infectious Pathogens (WACCBIP), University of Ghana, Accra, Ghana
| | - Dominic S Y Amuzu
- West African Centre for Cell Biology of Infectious Pathogens (WACCBIP), University of Ghana, Accra, Ghana
| | - Martin Avaro
- Servicio de Virus Respiratorios, Instituto Nacional Enfermedades Infecciosas, ANLIS "Dr. Carlos G. Malbrán", Buenos Aires, Argentina
| | - Gordon A Awandare
- West African Centre for Cell Biology of Infectious Pathogens (WACCBIP), University of Ghana, Accra, Ghana
| | - Reuben Ayivor-Djanie
- Laboratory for Medical Biotechnology and Biomanufacturing, International Centre for Genetic Engineering and Biotechnology, Trieste, Italy
- Department of Biomedical Sciences, University of Health and Allied Sciences, Ho, Ghana
| | - Matthew Bashton
- The Hub for Biotechnology in the Built Environment, Department of Applied Sciences, Faculty of Health and Life Sciences, Northumbria University, Newcastle upon Tyne, NE1 8ST, UK
| | - Elizabeth M Batty
- Centre for Tropical Medicine and Global Health, Nuffield Department of Medicine, University of Oxford, Oxford, UK
- Mahidol-Oxford Tropical Medicine Research Unit, Bangkok, Thailand
| | - Yaw Bediako
- West African Centre for Cell Biology of Infectious Pathogens (WACCBIP), University of Ghana, Accra, Ghana
| | - Denise De Belder
- Unidad Operativa Centro Nacional de Genómica y Bioinformática, ANLIS "Dr. Carlos G. Malbrán", Buenos Aires, Argentina
| | - Estefania Benedetti
- Servicio de Virus Respiratorios, Instituto Nacional Enfermedades Infecciosas, ANLIS "Dr. Carlos G. Malbrán", Buenos Aires, Argentina
| | - Andreas Bergthaler
- Institute for Hygiene and Applied Immunology, Center for Pathophysiology, Infectiology and Immunology, Medical University of Vienna, Vienna 1090, Austria
| | - Stefan A Boers
- Dept. Medical Microbiology, Leiden University Medical Center, Albinusdreef 2, 2333 ZA, Leiden, The Netherlands
| | - Josefina Campos
- Unidad Operativa Centro Nacional de Genómica y Bioinformática, ANLIS "Dr. Carlos G. Malbrán", Buenos Aires, Argentina
| | - Rosina Afua Ampomah Carr
- Department of Biomedical Sciences, University of Health and Allied Sciences, Ho, Ghana
- Department of Computational Medicine and Bioinformatics, University of Michigan, Michigan, Ann Arbor, MI, USA
| | - Facundo Cuba
- Unidad Operativa Centro Nacional de Genómica y Bioinformática, ANLIS "Dr. Carlos G. Malbrán", Buenos Aires, Argentina
| | - Maria Elena Dattero
- Servicio de Virus Respiratorios, Instituto Nacional Enfermedades Infecciosas, ANLIS "Dr. Carlos G. Malbrán", Buenos Aires, Argentina
| | - Wanwisa Dejnirattisai
- Division of Emerging Infectious Disease, Research Department, Faculty of Medicine Siriraj Hospital, Mahidol University, Bangkoknoi, Bangkok 10700, Thailand
| | - Alexander Dilthey
- Institute of Medical Microbiology and Hospital Hygiene, University Hospital Düsseldorf, Heinrich Heine University Düsseldorf, Düsseldorf, Germany
| | - Kwabena Obeng Duedu
- Department of Biomedical Sciences, University of Health and Allied Sciences, Ho, Ghana
- College of Life Sciences, Birmingham City University, Birmingham, UK
| | - Lukas Endler
- Institute for Hygiene and Applied Immunology, Center for Pathophysiology, Infectiology and Immunology, Medical University of Vienna, Vienna 1090, Austria
| | - Ilka Engelmann
- Pathogenesis and Control of Chronic and Emerging Infections, Univ Montpellier, INSERM, Etablissement Français du Sang, Virology Laboratory, CHU Montpellier, Montpellier, France
| | - Ngiambudulu M Francisco
- Grupo de Investigação Microbiana e Imunológica, Instituto Nacional de Investigação em Saúde (National Institute for Health Research), Luanda, Angola
| | - Jonas Fuchs
- Institute of Virology, Freiburg University Medical Center, Faculty of Medicine, University of Freiburg, Freiburg, Germany
| | - Etienne Z Gnimpieba
- Biomedical Engineering Department, University of South Dakota, Sioux Falls, SD 57107
| | - Soraya Groc
- Virology Laboratory, CHU Montpellier, Montpellier, France
| | - Jones Gyamfi
- Department of Biomedical Sciences, University of Health and Allied Sciences, Ho, Ghana
- School of Health and Life Sciences, Teesside University, Middlesbrough, UK
| | - Dennis Heemskerk
- Dept. Medical Microbiology, Leiden University Medical Center, Albinusdreef 2, 2333 ZA, Leiden, The Netherlands
| | - Torsten Houwaart
- Institute of Medical Microbiology and Hospital Hygiene, University Hospital Düsseldorf, Heinrich Heine University Düsseldorf, Düsseldorf, Germany
| | - Nei-Yuan Hsiao
- Division of Medical Virology, University of Cape Town and National Health Laboratory Service
| | - Matthew Huska
- Genome Competence Center (MF1), Robert Koch Institute, Nordufer 20, 13353 Berlin, Germany
| | - Martin Hölzer
- Genome Competence Center (MF1), Robert Koch Institute, Nordufer 20, 13353 Berlin, Germany
| | | | - Hanna Jarva
- HUS Diagnostic Center, Clinical Microbiology, University of Helsinki and Helsinki University Hospital, Helsinki, Finland
| | - Chandima Jeewandara
- Allergy Immunology and Cell Biology Unit, Department of Immunology and Molecular Medicine, University of Sri Jayewardenepura, Nugegoda, Sri Lanka
| | - Bani Jolly
- Karkinos Healthcare Private Limited (KHPL), Aurbis Business Parks, Bellandur, Bengaluru, Karnataka, 560103, India
- Academy of Scientific and Innovative Research (AcSIR), Ghaziabad, Uttar Pradesh, India
| | | | - Ravi Kant
- Department of Veterinary Biosciences, University of Helsinki, 00014 Helsinki, Finland
- Department of Virology, University of Helsinki, 00014 Helsinki, Finland
- Department of Tropical Parasitology, Institute of Maritime and Tropical Medicine, Medical University of Gdansk, 81-519 Gdynia, Poland
| | | | - Satu Kurkela
- HUS Diagnostic Center, Clinical Microbiology, University of Helsinki and Helsinki University Hospital, Helsinki, Finland
| | - Maija Lappalainen
- HUS Diagnostic Center, Clinical Microbiology, University of Helsinki and Helsinki University Hospital, Helsinki, Finland
| | - Marie Lataretu
- Genome Competence Center (MF1), Robert Koch Institute, Nordufer 20, 13353 Berlin, Germany
| | - Chang Liu
- Chinese Academy of Medical Science (CAMS) Oxford Institute (COI), University of Oxford, Oxford, UK
- Wellcome Centre for Human Genetics, Nuffield Department of Medicine, University of Oxford, Oxford, UK
| | - Gathsaurie Neelika Malavige
- Allergy Immunology and Cell Biology Unit, Department of Immunology and Molecular Medicine, University of Sri Jayewardenepura, Nugegoda, Sri Lanka
| | - Tapfumanei Mashe
- Health System Strengthening Unit, World Health Organisation, Harare, Zimbabwe
| | - Juthathip Mongkolsapaya
- Mahidol-Oxford Tropical Medicine Research Unit, Bangkok, Thailand
- Chinese Academy of Medical Science (CAMS) Oxford Institute (COI), University of Oxford, Oxford, UK
- Wellcome Centre for Human Genetics, Nuffield Department of Medicine, University of Oxford, Oxford, UK
| | | | - Jose Arturo Molina Mora
- Centro de investigación en Enfermedades Tropicales & Facultad de Microbiología, Universidad de Costa Rica, Costa Rica
| | - Collins M Morang'a
- West African Centre for Cell Biology of Infectious Pathogens (WACCBIP), University of Ghana, Accra, Ghana
| | - Bernard Mvula
- Public Health Institute of Malawi, Ministry of Health, Malawi
| | - Niranjan Nagarajan
- Genome Institute of Singapore, Agency for Science, Technology and Research (A*STAR), Singapore
- Yong Loo Lin School of Medicine, National University of Singapore, Singapore
| | - Andrew Nelson
- Department of Applied Sciences, Faculty of Health and Life Sciences, Northumbria University, Newcastle upon Tyne, NE1 8ST, UK
| | - Joyce M Ngoi
- West African Centre for Cell Biology of Infectious Pathogens (WACCBIP), University of Ghana, Accra, Ghana
| | - Joana Paula da Paixão
- Grupo de Investigação Microbiana e Imunológica, Instituto Nacional de Investigação em Saúde (National Institute for Health Research), Luanda, Angola
| | - Marcus Panning
- Institute of Virology, Freiburg University Medical Center, Faculty of Medicine, University of Freiburg, Freiburg, Germany
| | - Tomas Poklepovich
- Unidad Operativa Centro Nacional de Genómica y Bioinformática, ANLIS "Dr. Carlos G. Malbrán", Buenos Aires, Argentina
| | - Peter K Quashie
- West African Centre for Cell Biology of Infectious Pathogens (WACCBIP), University of Ghana, Accra, Ghana
| | - Diyanath Ranasinghe
- Allergy Immunology and Cell Biology Unit, Department of Immunology and Molecular Medicine, University of Sri Jayewardenepura, Nugegoda, Sri Lanka
| | - Mara Russo
- Servicio de Virus Respiratorios, Instituto Nacional Enfermedades Infecciosas, ANLIS "Dr. Carlos G. Malbrán", Buenos Aires, Argentina
| | - James Emmanuel San
- Duke Human Vaccine Institute, Duke University, Durham, NC 27710
- University of KwaZulu Natal, Durban, South Africa, 4001
| | - Nicholas D Sanderson
- Nuffield Department of Medicine, University of Oxford, Oxford, UK
- National Institute of Health Research Oxford Biomedical Research Centre, John Radcliffe Hospital, Headley Way, Oxford, UK
| | - Vinod Scaria
- Academy of Scientific and Innovative Research (AcSIR), Ghaziabad, Uttar Pradesh, India
- Vishwanath Cancer Care Foundation (VCCF), Neelkanth Business Park Kirol Village, West Mumbai, Maharashtra, 400086, India
| | - Gavin Screaton
- Nuffield Department of Medicine, University of Oxford, Oxford, UK
| | - Tarja Sironen
- Department of Veterinary Biosciences, University of Helsinki, 00014 Helsinki, Finland
- Department of Virology, University of Helsinki, 00014 Helsinki, Finland
| | - Abay Sisay
- Department of Medical Laboratory Sciences, College of Health Sciences, Addis Ababa University, P.O.Box 1176, Addis Ababa, Ethiopia
| | - Darren Smith
- The Hub for Biotechnology in the Built Environment, Department of Applied Sciences, Faculty of Health and Life Sciences, Northumbria University, Newcastle upon Tyne, NE1 8ST, UK
| | - Teemu Smura
- Department of Veterinary Biosciences, University of Helsinki, 00014 Helsinki, Finland
- Department of Virology, University of Helsinki, 00014 Helsinki, Finland
| | - Piyada Supasa
- Chinese Academy of Medical Science (CAMS) Oxford Institute (COI), University of Oxford, Oxford, UK
- Wellcome Centre for Human Genetics, Nuffield Department of Medicine, University of Oxford, Oxford, UK
| | - Chayaporn Suphavilai
- Genome Institute of Singapore, Agency for Science, Technology and Research (A*STAR), Singapore
| | - Jeremy Swann
- Nuffield Department of Medicine, University of Oxford, Oxford, UK
| | - Houriiyah Tegally
- Centre for Epidemic Response and Innovation (CERI), Stellenbosch University, South Africa
| | - Bryan Tegomoh
- Centre de Coordination des Opérations d'Urgences de Santé Publique, Ministere de Sante Publique, Cameroun
- University of California, Berkeley, Berkeley, California, USA
- Nebraska Department of Health and Human Services, Lincoln, Nebraska, USA
| | - Olli Vapalahti
- Department of Veterinary Biosciences, University of Helsinki, 00014 Helsinki, Finland
- Department of Virology, University of Helsinki, 00014 Helsinki, Finland
| | - Andreas Walker
- Institute of Virology, University Hospital Düsseldorf, Heinrich Heine University Düsseldorf, Düsseldorf, Germany
| | - Robert J Wilkinson
- Francis Crick Institute, London, UK
- Centre for Infectious Diseases Research in Africa, University of Cape Town
- Imperial College London, UK
| | | | - Tulio de Oliveira
- Centre for Epidemic Response and Innovation (CERI), Stellenbosch University, South Africa
- KwaZulu-Natal Research Innovation and Sequencing Platform (KRISP), University of KwaZulu-Natal, South Africa
| | - Timothy Ea Peto
- Nuffield Department of Medicine, University of Oxford, Oxford, UK
| | - Derrick Crook
- Nuffield Department of Medicine, University of Oxford, Oxford, UK
| | - Russell Corbett-Detig
- Genomics Institute, University of California, Santa Cruz, Santa Cruz, CA
- Department of Biomolecular Engineering, University of California, Santa Cruz, Santa Cruz, CA
| | - Zamin Iqbal
- European Molecular Biology Laboratory - European Bioinformatics Institute, Hinxton, UK
- Milner Centre for Evolution, University of Bath, UK
|
9
|
Kubinski HC, Despres HW, Johnson BA, Schmidt MM, Jaffrani SA, Mills MG, Lokugamage K, Dumas CM, Shirley DJ, Estes LK, Pekosz A, Crothers JW, Roychoudhury P, Greninger AL, Jerome KR, Di Genova BM, Walker DH, Ballif BA, Ladinsky MS, Bjorkman PJ, Menachery VD, Bruce EA. Variant mutation in SARS-CoV-2 nucleocapsid enhances viral infection via altered genomic encapsidation. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2024:2024.03.08.584120. [PMID: 38559000 PMCID: PMC10979914 DOI: 10.1101/2024.03.08.584120] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/04/2024]
Abstract
The evolution of SARS-CoV-2 variants and their respective phenotypes represents an important set of tools to understand basic coronavirus biology as well as the public health implications of individual mutations in variants of concern. While mutations outside of Spike are not well studied, the entire viral genome is undergoing evolutionary selection, particularly the central disordered linker region of the nucleocapsid (N) protein. Here, we identify a mutation (G215C), characteristic of the Delta variant, that introduces a novel cysteine into this linker domain, which results in the formation of a disulfide bond and a stable N-N dimer. Using reverse genetics, we determined that this cysteine residue is necessary and sufficient for stable dimer formation in a WA1 SARS-CoV-2 background, where it results in significantly increased viral growth both in vitro and in vivo. Finally, we demonstrate that the N:G215C virus packages more nucleocapsid per virion and that individual virions are larger, with elongated morphologies.
Affiliation(s)
- Hannah C. Kubinski
- Department of Microbiology and Molecular Genetics, Robert Larner, M.D. College of Medicine, University of Vermont, Burlington VT, 05405, USA
| | - Hannah W. Despres
- Department of Microbiology and Molecular Genetics, Robert Larner, M.D. College of Medicine, University of Vermont, Burlington VT, 05405, USA
| | - Bryan A. Johnson
- Department of Microbiology and Immunology, University of Texas Medical Branch, Galveston, Texas, USA
- Institute for Human Infection and Immunity, University of Texas Medical Branch, Galveston, TX, USA
- Center for Tropical Diseases, University of Texas Medical Branch, Galveston, TX, USA
| | - Madaline M. Schmidt
- Department of Microbiology and Molecular Genetics, Robert Larner, M.D. College of Medicine, University of Vermont, Burlington VT, 05405, USA
| | - Sara A. Jaffrani
- Department of Microbiology and Molecular Genetics, Robert Larner, M.D. College of Medicine, University of Vermont, Burlington VT, 05405, USA
| | - Margaret G. Mills
- Virology Division, Department of Laboratory Medicine and Pathology, University of Washington, Seattle WA 98195, USA
| | - Kumari Lokugamage
- Department of Microbiology and Immunology, University of Texas Medical Branch, Galveston, Texas, USA
| | - Caroline M. Dumas
- Department of Biology, University of Vermont 109 Carrigan Drive, 120A Marsh Life Sciences, Burlington VT 05404, USA
| | - David J. Shirley
- Faraday, Inc. Data Science Department. Burlington VT, 05405, USA
| | - Leah K. Estes
- Department of Microbiology and Immunology, University of Texas Medical Branch, Galveston, Texas, USA
| | - Andrew Pekosz
- W. Harry Feinstone Department of Molecular Microbiology and Immunology, The Johns Hopkins Bloomberg School of Public Health, Baltimore, MD, USA
| | - Jessica W. Crothers
- Department of Pathology and Laboratory Medicine, Robert Larner, MD College of Medicine, University of Vermont, Burlington, VT, USA
| | - Pavitra Roychoudhury
- Virology Division, Department of Laboratory Medicine and Pathology, University of Washington, Seattle WA 98195, USA
| | - Alexander L. Greninger
- Virology Division, Department of Laboratory Medicine and Pathology, University of Washington, Seattle WA 98195, USA
- Vaccine and Infectious Disease Division, Fred Hutchinson Cancer Research Center, Seattle WA 98109, USA
| | - Keith R. Jerome
- Virology Division, Department of Laboratory Medicine and Pathology, University of Washington, Seattle WA 98195, USA
- Vaccine and Infectious Disease Division, Fred Hutchinson Cancer Research Center, Seattle WA 98109, USA
| | - Bruno Martorelli Di Genova
- Department of Microbiology and Molecular Genetics, Robert Larner, M.D. College of Medicine, University of Vermont, Burlington VT, 05405, USA
| | - David H. Walker
- Department of Pathology, University of Texas Medical Branch, Galveston, Texas, USA
| | - Bryan A. Ballif
- Department of Biology, University of Vermont 109 Carrigan Drive, 120A Marsh Life Sciences, Burlington VT 05404, USA
| | - Mark S. Ladinsky
- Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, CA. 91125, USA
| | - Pamela J. Bjorkman
- Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, CA. 91125, USA
| | - Vineet D. Menachery
- Department of Microbiology and Immunology, University of Texas Medical Branch, Galveston, Texas, USA
- World Reference Center of Emerging Viruses and Arboviruses, University of Texas Medical Branch, Galveston, Texas, USA
- Center for Biodefense and Emerging Infectious Diseases, University of Texas Medical Branch, Galveston, Texas, USA
| | - Emily A. Bruce
- Department of Microbiology and Molecular Genetics, Robert Larner, M.D. College of Medicine, University of Vermont, Burlington VT, 05405, USA
|
10
|
Elko EA, Mead HL, Nelson GA, Zaia JA, Ladner JT, Altin JA. Recurrent SARS-CoV-2 mutations at Spike D796 evade antibodies from pre-Omicron convalescent and vaccinated subjects. Microbiol Spectr 2024; 12:e0329123. [PMID: 38189279 PMCID: PMC10871546 DOI: 10.1128/spectrum.03291-23] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/06/2023] [Accepted: 12/03/2023] [Indexed: 01/09/2024] Open
Abstract
Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) lineages of the Omicron variant rapidly became dominant in early 2022 and frequently cause human infections despite vaccination or prior infection with other variants. In addition to antibody-evading mutations in the receptor-binding domain, Omicron features amino acid mutations elsewhere in the Spike protein; however, their effects generally remain ill defined. The Spike D796Y substitution is present in all Omicron sub-variants and occurs at the same site as a mutation (D796H) selected during viral evolution in a chronically infected patient. Here, we map antibody reactivity to a linear epitope in the Spike protein overlapping position 796. We show that antibodies binding this region arise in pre-Omicron SARS-CoV-2 convalescent and vaccinated subjects but that both D796Y and D796H abrogate their binding. These results suggest that D796Y contributes to the fitness of Omicron in hosts with pre-existing immunity to other variants of SARS-CoV-2 by evading antibodies targeting this site. IMPORTANCE Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) has evolved substantially through the coronavirus disease 2019 (COVID-19) pandemic: understanding the drivers and consequences of this evolution is essential for projecting the course of the pandemic and developing new countermeasures. Here, we study the immunological effects of a particular mutation present in the Spike protein of all Omicron strains and find that it prevents the efficient binding of a class of antibodies raised by pre-Omicron vaccination and infection. These findings reveal a novel consequence of a poorly understood Omicron mutation and shed light on the drivers and effects of SARS-CoV-2 evolution.
Affiliation(s)
- Evan A. Elko
- The Pathogen and Microbiome Institute, Northern Arizona University, Flagstaff, Arizona, USA
| | - Heather L. Mead
- The Translational Genomics Research Institute (TGen), Flagstaff, Arizona, USA
| | - Georgia A. Nelson
- The Translational Genomics Research Institute (TGen), Flagstaff, Arizona, USA
| | - John A. Zaia
- Center for Gene Therapy, Department of Hematology and Hematopoietic Cell Transplantation, City of Hope National Medical Center, Duarte, California, USA
| | - Jason T. Ladner
- The Pathogen and Microbiome Institute, Northern Arizona University, Flagstaff, Arizona, USA
| | - John A. Altin
- The Translational Genomics Research Institute (TGen), Flagstaff, Arizona, USA
|
11
|
Rahman N, O'Cathail C, Zyoud A, Sokolov A, Oude Munnink B, Grüning B, Cummins C, Amid C, Nieuwenhuijse DF, Visontai D, Yuan DY, Gupta D, Prasad DK, Gulyás GM, Rinck G, McKinnon J, Rajan J, Knaggs J, Skiby JE, Stéger J, Szarvas J, Gueye K, Papp K, Hoek M, Kumar M, Ventouratou MA, Bouquieaux MC, Koliba M, Mansurova M, Haseeb M, Worp N, Harrison PW, Leinonen R, Thorne R, Selvakumar S, Hunt S, Venkataraman S, Jayathilaka S, Cezard T, Maier W, Waheed Z, Iqbal Z, Aarestrup FM, Csabai I, Koopmans M, Burdett T, Cochrane G. Mobilisation and analyses of publicly available SARS-CoV-2 data for pandemic responses. Microb Genom 2024; 10:001188. [PMID: 38358325 PMCID: PMC10926692 DOI: 10.1099/mgen.0.001188] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/04/2023] [Accepted: 01/14/2024] [Indexed: 02/16/2024] Open
Abstract
The COVID-19 pandemic has seen large-scale pathogen genomic sequencing efforts, becoming part of the toolbox for surveillance and epidemic research. This resulted in an unprecedented level of data sharing to open repositories, which has actively supported the identification of SARS-CoV-2 structure, molecular interactions, mutations and variants, and facilitated vaccine development and drug reuse studies and design. The European COVID-19 Data Platform was launched to support this data sharing, and has resulted in the deposition of several million SARS-CoV-2 raw reads. In this paper we describe (1) open data sharing, (2) tools for submission, analysis, visualisation and data claiming (e.g. ORCiD), (3) the systematic analysis of these datasets, at scale via the SARS-CoV-2 Data Hubs as well as (4) lessons learnt. This paper describes a component of the Platform, the SARS-CoV-2 Data Hubs, which enable the extension and set up of infrastructure that we intend to use more widely in the future for pathogen surveillance and pandemic preparedness.
Affiliation(s)
- Nadim Rahman
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridgeshire, UK
| | - Colman O'Cathail
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridgeshire, UK
| | - Ahmad Zyoud
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridgeshire, UK
| | - Alexey Sokolov
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridgeshire, UK
| | - Bas Oude Munnink
- Erasmus Medical Center, Wytemaweg 80, 3015 CN Rotterdam, Netherlands
| | - Björn Grüning
- University of Freiburg, Friedrichstr. 39, 79098 Freiburg, Germany
| | - Carla Cummins
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridgeshire, UK
| | - Clara Amid
- Erasmus Medical Center, Wytemaweg 80, 3015 CN Rotterdam, Netherlands
| | | | - Dávid Visontai
- Eötvös Loránd University, H-1053 Budapest, Egyetem tér 1-3, Hungary
| | - David Yu Yuan
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridgeshire, UK
| | - Dipayan Gupta
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridgeshire, UK
| | - Divyae K. Prasad
- Erasmus Medical Center, Wytemaweg 80, 3015 CN Rotterdam, Netherlands
| | - Gábor Máté Gulyás
- Technical University of Denmark, Anker Engelunds Vej 101, 2800 Kongens Lyngby, Denmark
| | - Gabriele Rinck
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridgeshire, UK
| | - Jasmine McKinnon
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridgeshire, UK
| | - Jeena Rajan
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridgeshire, UK
| | - Jeff Knaggs
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridgeshire, UK
| | - Jeffrey Edward Skiby
- Technical University of Denmark, Anker Engelunds Vej 101, 2800 Kongens Lyngby, Denmark
| | - József Stéger
- Eötvös Loránd University, H-1053 Budapest, Egyetem tér 1-3, Hungary
| | - Judit Szarvas
- Technical University of Denmark, Anker Engelunds Vej 101, 2800 Kongens Lyngby, Denmark
| | - Khadim Gueye
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridgeshire, UK
| | - Krisztián Papp
- Eötvös Loránd University, H-1053 Budapest, Egyetem tér 1-3, Hungary
| | - Maarten Hoek
- Erasmus Medical Center, Wytemaweg 80, 3015 CN Rotterdam, Netherlands
| | - Manish Kumar
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridgeshire, UK
| | - Marianna A. Ventouratou
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridgeshire, UK
| | | | - Martin Koliba
- Technical University of Denmark, Anker Engelunds Vej 101, 2800 Kongens Lyngby, Denmark
| | - Milena Mansurova
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridgeshire, UK
| | - Muhammad Haseeb
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridgeshire, UK
| | - Nathalie Worp
- Erasmus Medical Center, Wytemaweg 80, 3015 CN Rotterdam, Netherlands
| | - Peter W. Harrison
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridgeshire, UK
| | - Rasko Leinonen
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridgeshire, UK
| | - Ross Thorne
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridgeshire, UK
| | - Sandeep Selvakumar
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridgeshire, UK
| | - Sarah Hunt
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridgeshire, UK
| | - Sundar Venkataraman
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridgeshire, UK
| | - Suran Jayathilaka
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridgeshire, UK
| | - Timothée Cezard
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridgeshire, UK
| | - Wolfgang Maier
- University of Freiburg, Friedrichstr. 39, 79098 Freiburg, Germany
| | - Zahra Waheed
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridgeshire, UK
| | - Zamin Iqbal
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridgeshire, UK
| | | | - Istvan Csabai
- Eötvös Loránd University, H-1053 Budapest, Egyetem tér 1-3, Hungary
| | - Marion Koopmans
- Erasmus Medical Center, Wytemaweg 80, 3015 CN Rotterdam, Netherlands
| | - Tony Burdett
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridgeshire, UK
| | - Guy Cochrane
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridgeshire, UK
| |
|
12
|
McBroome J, de Bernardi Schneider A, Roemer C, Wolfinger MT, Hinrichs AS, O'Toole AN, Ruis C, Turakhia Y, Rambaut A, Corbett-Detig R. A framework for automated scalable designation of viral pathogen lineages from genomic data. Nat Microbiol 2024; 9:550-560. [PMID: 38316930 PMCID: PMC10847047 DOI: 10.1038/s41564-023-01587-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/06/2023] [Accepted: 12/13/2023] [Indexed: 02/07/2024]
Abstract
Pathogen lineage nomenclature systems are a key component of effective communication and collaboration for researchers and public health workers. Since February 2021, the Pango dynamic lineage nomenclature for SARS-CoV-2 has been sustained by crowdsourced lineage proposals as new isolates were sequenced. This approach is vulnerable to time-critical delays as well as regional and personal bias. Here we developed a simple heuristic approach for dividing phylogenetic trees into lineages, including the prioritization of key mutations or genes. Our implementation is efficient on extremely large phylogenetic trees consisting of millions of sequences and produces similar results to existing manually curated lineage designations when applied to SARS-CoV-2 and other viruses including chikungunya virus, Venezuelan equine encephalitis virus complex and Zika virus. This method offers a simple, automated and consistent approach to pathogen nomenclature that can assist researchers in developing and maintaining phylogeny-based classifications in the face of ever-increasing genomic datasets.
Affiliation(s)
- Jakob McBroome
- Department of Biomolecular Engineering, University of California Santa Cruz, Santa Cruz, CA, USA.
- Genomics Institute, University of California Santa Cruz, Santa Cruz, CA, USA.
| | - Adriano de Bernardi Schneider
- Department of Biomolecular Engineering, University of California Santa Cruz, Santa Cruz, CA, USA
- Genomics Institute, University of California Santa Cruz, Santa Cruz, CA, USA
| | - Cornelius Roemer
- Biozentrum, University of Basel, Basel, Switzerland
- Swiss Institute of Bioinformatics, Basel, Switzerland
| | - Michael T Wolfinger
- Department of Theoretical Chemistry, University of Vienna, Vienna, Austria
- Research Group Bioinformatics and Computational Biology, Faculty of Computer Science, University of Vienna, Vienna, Austria
- RNA Forecast e.U., Vienna, Austria
- Bioinformatics Group, Department of Computer Science, University of Freiburg, Freiburg, Germany
| | - Angie S Hinrichs
- Genomics Institute, University of California Santa Cruz, Santa Cruz, CA, USA
| | - Aine Niamh O'Toole
- Institute of Ecology and Evolution, University of Edinburgh, Edinburgh, UK
| | - Christopher Ruis
- Molecular Immunity Unit, MRC Laboratory of Molecular Biology, Department of Medicine, University of Cambridge, Cambridge, UK
- Department of Veterinary Medicine, University of Cambridge, Cambridge, UK
- Cambridge Centre for AI in Medicine, University of Cambridge, Cambridge, UK
| | - Yatish Turakhia
- Department of Electrical and Computer Engineering, University of California San Diego, San Diego, CA, USA
| | - Andrew Rambaut
- Institute of Ecology and Evolution, University of Edinburgh, Edinburgh, UK
| | - Russell Corbett-Detig
- Department of Biomolecular Engineering, University of California Santa Cruz, Santa Cruz, CA, USA.
- Genomics Institute, University of California Santa Cruz, Santa Cruz, CA, USA.
| |
|
13
|
Reji L, Darnajoux R, Zhang X. A genomic view of environmental and life history controls on microbial nitrogen acquisition strategies. Environ Microbiol Rep 2024; 16:e13220. [PMID: 38057292 PMCID: PMC10866080 DOI: 10.1111/1758-2229.13220] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/03/2023] [Accepted: 11/15/2023] [Indexed: 12/08/2023]
Abstract
Microorganisms have evolved diverse strategies to acquire the vital element nitrogen (N) from the environment. Ecological and physiological controls on the distribution of these strategies among microbes remain unclear. In this study, we examine the distribution of 10 major N acquisition strategies in taxonomically and metabolically diverse microbial genomes, including those from the Genomic Catalogue of Earth's Microbiomes dataset. We utilize a marker gene-based approach to assess relationships between N acquisition strategy prevalence and microbial life history strategies. Our results underscore energetic costs of assimilation as a broad control on strategy distribution. The most prevalent strategies are the uptake of ammonium and simple amino acids, which have relatively low energetic costs, while energy-intensive biological nitrogen fixation is the least common. Deviations from the energy-based framework include the higher-than-expected prevalence of the assimilatory pathway for chitin, a large organic polymer. Energy availability is also important, with aerobic chemoorganotrophs and oxygenic phototrophs notably possessing ~2-fold higher numbers of total strategies compared to anaerobic microbes. Environmental controls are evidenced by the enrichment of inorganic N assimilation strategies among free-living taxa compared to host-associated taxa. Physiological constraints such as pathway incompatibility add complexity to N acquisition strategy distributions. Finally, we discuss the necessity for microbially-relevant spatiotemporal environmental metadata for improving mechanistic and prediction-oriented analyses of genomic data.
Affiliation(s)
- Linta Reji
- Department of Geosciences, Princeton University, Princeton, New Jersey, USA
- High Meadows Environmental Institute, Princeton University, Princeton, New Jersey, USA
| | - Romain Darnajoux
- Department of Geosciences, Princeton University, Princeton, New Jersey, USA
| | - Xinning Zhang
- Department of Geosciences, Princeton University, Princeton, New Jersey, USA
- High Meadows Environmental Institute, Princeton University, Princeton, New Jersey, USA
| |
|
14
|
Hinrichs A, Ye C, Turakhia Y, Corbett-Detig R. The ongoing evolution of UShER during the SARS-CoV-2 pandemic. Nat Genet 2024; 56:4-7. [PMID: 38155331 DOI: 10.1038/s41588-023-01622-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/30/2023]
Affiliation(s)
- Angie Hinrichs
- Genomics Institute, University of California, Santa Cruz, Santa Cruz, CA, USA
| | - Cheng Ye
- Department of Electrical and Computer Engineering, University of California, San Diego, San Diego, CA, USA
| | - Yatish Turakhia
- Department of Electrical and Computer Engineering, University of California, San Diego, San Diego, CA, USA
| | - Russell Corbett-Detig
- Genomics Institute, University of California, Santa Cruz, Santa Cruz, CA, USA.
- Department of Biomolecular Engineering, University of California, Santa Cruz, Santa Cruz, CA, USA.
| |
|
15
|
Penn MJ, Scheidwasser N, Penn J, Donnelly CA, Duchêne DA, Bhatt S. Leaping through Tree Space: Continuous Phylogenetic Inference for Rooted and Unrooted Trees. Genome Biol Evol 2023; 15:evad213. [PMID: 38085949 PMCID: PMC10745275 DOI: 10.1093/gbe/evad213] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 11/16/2023] [Indexed: 12/24/2023] Open
Abstract
Phylogenetics is now fundamental in life sciences, providing insights into the earliest branches of life and the origins and spread of epidemics. However, finding suitable phylogenies from the vast space of possible trees remains challenging. To address this problem, for the first time, we perform both tree exploration and inference in a continuous space where the computation of gradients is possible. This continuous relaxation allows for major leaps across tree space in both rooted and unrooted trees, and is less susceptible to convergence to local minima. Our approach outperforms the current best methods for inference on unrooted trees and, in simulation, accurately infers the tree and root in ultrametric cases. The approach is effective in cases of empirical data with negligible amounts of data, which we demonstrate on the phylogeny of jawed vertebrates. Indeed, only a few genes with an ultrametric signal were generally sufficient for resolving the major lineages of vertebrates. Optimization is possible via automatic differentiation and our method presents an effective way forward for exploring the most difficult, data-deficient phylogenetic questions.
Affiliation(s)
- Matthew J Penn
- Department of Statistics, University of Oxford, Oxford, United Kingdom
| | - Neil Scheidwasser
- Section of Epidemiology, University of Copenhagen, Copenhagen, Denmark
| | - Joseph Penn
- Department of Physics, University of Oxford, Oxford, United Kingdom
| | - Christl A Donnelly
- Department of Statistics, University of Oxford, Oxford, United Kingdom
- Pandemic Sciences Institute, University of Oxford, Oxford, United Kingdom
- Department of Infectious Disease Epidemiology, MRC Centre for Global Infectious Disease Analysis, School of Public Health, Faculty of Medicine, Imperial College London, London, United Kingdom
| | - David A Duchêne
- Center for Evolutionary Hologenomics, Globe Institute, University of Copenhagen, Copenhagen, Denmark
| | - Samir Bhatt
- Section of Epidemiology, University of Copenhagen, Copenhagen, Denmark
- Department of Infectious Disease Epidemiology, MRC Centre for Global Infectious Disease Analysis, School of Public Health, Faculty of Medicine, Imperial College London, London, United Kingdom
| |
|
16
|
Kramer AM, Thornlow B, Ye C, De Maio N, McBroome J, Hinrichs AS, Lanfear R, Turakhia Y, Corbett-Detig R. Online Phylogenetics with matOptimize Produces Equivalent Trees and is Dramatically More Efficient for Large SARS-CoV-2 Phylogenies than de novo and Maximum-Likelihood Implementations. Syst Biol 2023; 72:1039-1051. [PMID: 37232476 PMCID: PMC10627557 DOI: 10.1093/sysbio/syad031] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Received: 11/29/2021] [Revised: 05/14/2023] [Accepted: 06/22/2023] [Indexed: 05/27/2023] Open
Abstract
Phylogenetics has been foundational to SARS-CoV-2 research and public health policy, assisting in genomic surveillance, contact tracing, and assessing emergence and spread of new variants. However, phylogenetic analyses of SARS-CoV-2 have often relied on tools designed for de novo phylogenetic inference, in which all data are collected before any analysis is performed and the phylogeny is inferred once from scratch. SARS-CoV-2 data sets do not fit this mold. There are currently over 14 million sequenced SARS-CoV-2 genomes in online databases, with tens of thousands of new genomes added every day. Continuous data collection, combined with the public health relevance of SARS-CoV-2, invites an "online" approach to phylogenetics, in which new samples are added to existing phylogenetic trees every day. The extremely dense sampling of SARS-CoV-2 genomes also invites a comparison between likelihood and parsimony approaches to phylogenetic inference. Maximum likelihood (ML) and pseudo-ML methods may be more accurate when there are multiple changes at a single site on a single branch, but this accuracy comes at a large computational cost, and the dense sampling of SARS-CoV-2 genomes means that these instances will be extremely rare because each internal branch is expected to be extremely short. Therefore, it may be that approaches based on maximum parsimony (MP) are sufficiently accurate for reconstructing phylogenies of SARS-CoV-2, and their simplicity means that they can be applied to much larger data sets. Here, we evaluate the performance of de novo and online phylogenetic approaches, as well as ML, pseudo-ML, and MP frameworks for inferring large and dense SARS-CoV-2 phylogenies. Overall, we find that online phylogenetics produces similar phylogenetic trees to de novo analyses for SARS-CoV-2, and that MP optimization with UShER and matOptimize produces equivalent SARS-CoV-2 phylogenies to some of the most popular ML and pseudo-ML inference tools. 
MP optimization with UShER and matOptimize is thousands of times faster than presently available implementations of ML, and online phylogenetics is faster than de novo inference. Our results therefore suggest that parsimony-based methods like UShER and matOptimize represent an accurate and more practical alternative to established ML implementations for large SARS-CoV-2 phylogenies and could be successfully applied to other similar data sets with particularly dense sampling and short branch lengths.
Affiliation(s)
- Alexander M Kramer
- Department of Biomolecular Engineering, University of California Santa Cruz, Santa Cruz, CA 95064, USA
- Genomics Institute, University of California Santa Cruz, Santa Cruz, CA 95064, USA
| | - Bryan Thornlow
- Department of Biomolecular Engineering, University of California Santa Cruz, Santa Cruz, CA 95064, USA
- Genomics Institute, University of California Santa Cruz, Santa Cruz, CA 95064, USA
| | - Cheng Ye
- Department of Electrical and Computer Engineering, University of California San Diego, San Diego, CA 92093, USA
| | - Nicola De Maio
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Cambridge CB10 1SD, UK
| | - Jakob McBroome
- Department of Biomolecular Engineering, University of California Santa Cruz, Santa Cruz, CA 95064, USA
- Genomics Institute, University of California Santa Cruz, Santa Cruz, CA 95064, USA
| | - Angie S Hinrichs
- Genomics Institute, University of California Santa Cruz, Santa Cruz, CA 95064, USA
| | - Robert Lanfear
- Department of Ecology and Evolution, Research School of Biology, Australian National University, Canberra, ACT 2601, Australia
| | - Yatish Turakhia
- Department of Electrical and Computer Engineering, University of California San Diego, San Diego, CA 92093, USA
| | - Russell Corbett-Detig
- Department of Biomolecular Engineering, University of California Santa Cruz, Santa Cruz, CA 95064, USA
- Genomics Institute, University of California Santa Cruz, Santa Cruz, CA 95064, USA
| |
|
17
|
Sanderson T, Hisner R, Donovan-Banfield I, Hartman H, Løchen A, Peacock TP, Ruis C. A molnupiravir-associated mutational signature in global SARS-CoV-2 genomes. Nature 2023; 623:594-600. [PMID: 37748513 PMCID: PMC10651478 DOI: 10.1038/s41586-023-06649-6] [Citation(s) in RCA: 51] [Impact Index Per Article: 51.0] [Received: 01/27/2023] [Accepted: 09/15/2023] [Indexed: 09/27/2023]
Abstract
Molnupiravir, an antiviral medication widely used against severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), acts by inducing mutations in the virus genome during replication. Most random mutations are likely to be deleterious to the virus and many will be lethal; thus, molnupiravir-induced elevated mutation rates reduce viral load [1,2]. However, if some patients treated with molnupiravir do not fully clear the SARS-CoV-2 infections, there could be the potential for onward transmission of molnupiravir-mutated viruses. Here we show that SARS-CoV-2 sequencing databases contain extensive evidence of molnupiravir mutagenesis. Using a systematic approach, we find that a specific class of long phylogenetic branches, distinguished by a high proportion of G-to-A and C-to-T mutations, is found almost exclusively in sequences from 2022, after the introduction of molnupiravir treatment, and in countries and age groups with widespread use of the drug. We identify a mutational spectrum, with preferred nucleotide contexts, from viruses in patients known to have been treated with molnupiravir and show that its signature matches that seen in these long branches, in some cases with onward transmission of molnupiravir-derived lineages. Finally, we analyse treatment records to confirm a direct association between these high G-to-A branches and the use of molnupiravir.
Affiliation(s)
| | - Ryan Hisner
- Department of Bioinformatics, University of Cape Town, Cape Town, South Africa
| | - I'ah Donovan-Banfield
- Department of Infection Biology and Microbiomes, Institute of Infection, Veterinary and Ecological Sciences, University of Liverpool, Liverpool, UK
- Health Protection Research Unit in Emerging and Zoonotic Infections, National Institute for Health and Care Research, Liverpool, UK
| | | | | | - Thomas P Peacock
- Department of Infectious Disease, Imperial College London, London, UK
- The Pirbright Institute, Pirbright, UK
| | - Christopher Ruis
- Molecular Immunity Unit, University of Cambridge Department of Medicine, Medical Research Council-Laboratory of Molecular Biology, Cambridge, UK.
- Department of Veterinary Medicine, University of Cambridge, Cambridge, UK.
- Cambridge Centre for AI in Medicine, University of Cambridge, Cambridge, UK.
- Victor Phillip Dahdaleh Heart & Lung Research Institute, University of Cambridge, Cambridge, UK.
| |
|
18
|
Volz E. Fitness, growth and transmissibility of SARS-CoV-2 genetic variants. Nat Rev Genet 2023; 24:724-734. [PMID: 37328556 DOI: 10.1038/s41576-023-00610-z] [Citation(s) in RCA: 6] [Impact Index Per Article: 6.0] [Accepted: 04/25/2023] [Indexed: 06/18/2023]
Abstract
The massive scale of the global SARS-CoV-2 sequencing effort created new opportunities and challenges for understanding SARS-CoV-2 evolution. Rapid detection and assessment of new variants has become one of the principal objectives of genomic surveillance of SARS-CoV-2. Because of the pace and scale of sequencing, new strategies have been developed for characterizing fitness and transmissibility of emerging variants. In this Review, I discuss a wide range of approaches that have been rapidly developed in response to the public health threat posed by emerging variants, ranging from new applications of classic population genetics models to contemporary synthesis of epidemiological models and phylodynamic analysis. Many of these approaches can be adapted to other pathogens and will have increasing relevance as large-scale pathogen sequencing becomes a regular feature of many public health systems.
Affiliation(s)
- Erik Volz
- Department of Infectious Disease Epidemiology, MRC Centre for Global Infectious Disease Analysis, Imperial College London, London, UK.
| |
|
19
|
Smith K, Ye C, Turakhia Y. Tracking and curating putative SARS-CoV-2 recombinants with RIVET. Bioinformatics 2023; 39:btad538. [PMID: 37651464 PMCID: PMC10493179 DOI: 10.1093/bioinformatics/btad538] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 02/20/2023] [Revised: 07/15/2023] [Accepted: 08/30/2023] [Indexed: 09/02/2023] Open
Abstract
MOTIVATION Identifying and tracking recombinant strains of SARS-CoV-2 is critical to understanding the evolution of the virus and controlling its spread. But confidently identifying SARS-CoV-2 recombinants from thousands of new genome sequences that are being shared online every day is quite challenging, causing many recombinants to be missed or to suffer from weeks of delay in being formally identified while undergoing expert curation. RESULTS We present RIVET, a software pipeline and visual platform that takes advantage of recent algorithmic advances in recombination inference to comprehensively and sensitively search for potential SARS-CoV-2 recombinants and organize the relevant information in a web interface that would help greatly accelerate the process of identifying and tracking recombinants. AVAILABILITY AND IMPLEMENTATION RIVET-based web interface displaying the most updated analysis of potential SARS-CoV-2 recombinants is available at https://rivet.ucsd.edu/. RIVET's frontend and backend code is freely available under the MIT license at https://github.com/TurakhiaLab/rivet and the documentation for RIVET is available at https://turakhialab.github.io/rivet/. The inputs necessary for running RIVET's backend workflow for SARS-CoV-2 are available through a public database maintained and updated daily by UCSC (https://hgdownload.soe.ucsc.edu/goldenPath/wuhCor1/UShER_SARS-CoV-2/).
Affiliation(s)
- Kyle Smith
- Department of Biological Sciences, University of California, San Diego, San Diego, CA 92093, United States
| | - Cheng Ye
- Department of Electrical and Computer Engineering, University of California, San Diego, San Diego, CA 92093, United States
| | - Yatish Turakhia
- Department of Electrical and Computer Engineering, University of California, San Diego, San Diego, CA 92093, United States
| |
|
20
|
Chakraborty C, Bhattacharya M, Saikumar G, Alshammari A, Alharbi M, Lee SS, Dhama K. A European perspective of phylogenomics, sublineages, geographical distribution, epidemiology, and mutational landscape of mpox virus: Emergence pattern may help to fight the next public health emergency in Europe. J Infect Public Health 2023; 16:1004-1014. [PMID: 37172461 PMCID: PMC10147450 DOI: 10.1016/j.jiph.2023.04.017] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/09/2023] [Revised: 04/09/2023] [Accepted: 04/26/2023] [Indexed: 05/15/2023] Open
Abstract
BACKGROUND The 2022 outbreak of the mpox virus (previously monkeypox virus, MPXV) in non-epidemic regions has created a global issue. The emergence of MPXV was first reported in Europe, which was described as the MPXV epicenter; however, no reports are available to illustrate its outbreak patterns in Europe. METHODS The study used numerous in silico and statistical methods to analyze hMPXV1 in European countries. Here, we used different bioinformatics servers and software to evaluate the spread of hMPXV1 in European countries. For analysis, we used various advanced servers such as Nextstrain, Taxonium, and MpoxSpectrum. Similarly, for the statistical model, we used PAST software. RESULTS The phylogenetic tree was depicted to illustrate the origin and evolution of hMPXV1 using a vast number of genome sequences (n = 675). We found several sublineages in Europe, indicating microevolution. The scatter plot reveals the clustering patterns of the newly developed lineages in Europe. We developed statistical models for the monthly total relative frequency counts of these sublineages. The epidemiology of MPX in Europe was examined in an attempt to capture the epidemiological pattern, total cases, and deaths. Our study noted that the highest number of cases was in Spain (7500 cases) and the second-highest in France (4114 cases). The third-highest number of cases was in the UK (3730 cases), which was very similar to Germany (3677 cases). Finally, we noted the mutational landscape throughout European genomes. Significant mutations were observed at the nucleotide and protein levels. We identified several unique homoplastic mutations in Europe. CONCLUSION This study reveals several essential aspects of the European outbreak. It might help to eradicate the virus in Europe, assist in strategy formation to fight against the virus, and support working against the next public health emergency in Europe.
Affiliation(s)
- Chiranjib Chakraborty
- Department of Biotechnology, School of Life Science and Biotechnology, Adamas University, Kolkata, West Bengal 700126, India.
| | - Manojit Bhattacharya
- Department of Zoology, Fakir Mohan University, Vyasa Vihar, Balasore 756020, Odisha, India
| | - G Saikumar
- Division of Pathology, ICAR-Indian Veterinary Research Institute, Izatnagar, Bareilly 243122, Uttar Pradesh, India
| | - Abdulrahman Alshammari
- Department of Pharmacology and Toxicology, College of Pharmacy, King Saud University, Post Box 2455, Riyadh 11451, Saudi Arabia
| | - Metab Alharbi
- Department of Pharmacology and Toxicology, College of Pharmacy, King Saud University, Post Box 2455, Riyadh 11451, Saudi Arabia
| | - Sang-Soo Lee
- Institute for Skeletal Aging & Orthopaedic Surgery, Hallym University-Chuncheon Sacred Heart Hospital, Chuncheon-si 24252, Gangwon-do, Republic of Korea
| | - Kuldeep Dhama
- Division of Pathology, ICAR-Indian Veterinary Research Institute, Izatnagar, Bareilly 243122, Uttar Pradesh, India
| |
|
21
|
Mixão V, Pinto M, Sobral D, Di Pasquale A, Gomes JP, Borges V. ReporTree: a surveillance-oriented tool to strengthen the linkage between pathogen genetic clusters and epidemiological data. Genome Med 2023; 15:43. [PMID: 37322495 PMCID: PMC10273728 DOI: 10.1186/s13073-023-01196-1] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 09/29/2022] [Accepted: 05/23/2023] [Indexed: 06/17/2023] Open
Abstract
BACKGROUND Genomics-informed pathogen surveillance strengthens public health decision-making, playing an important role in infectious diseases' prevention and control. A pivotal outcome of genomics surveillance is the identification of pathogen genetic clusters and their characterization in terms of geotemporal spread or linkage to clinical and demographic data. This task often consists of the visual exploration of (large) phylogenetic trees and associated metadata, being time-consuming and difficult to reproduce. RESULTS We developed ReporTree, a flexible bioinformatics pipeline that allows diving into the complexity of pathogen diversity to rapidly identify genetic clusters at any (or all) distance threshold(s) or cluster stability regions and to generate surveillance-oriented reports based on the available metadata, such as timespan, geography, or vaccination/clinical status. ReporTree is able to maintain cluster nomenclature in subsequent analyses and to generate a nomenclature code combining cluster information at different hierarchical levels, thus facilitating the active surveillance of clusters of interest. By handling several input formats and clustering methods, ReporTree is applicable to multiple pathogens, constituting a flexible resource that can be smoothly deployed in routine surveillance bioinformatics workflows with negligible computational and time costs. This is demonstrated through a comprehensive benchmarking of (i) the cg/wgMLST workflow with large datasets of four foodborne bacterial pathogens and (ii) the alignment-based SNP workflow with a large dataset of Mycobacterium tuberculosis. To further validate this tool, we reproduced a previous large-scale study on Neisseria gonorrhoeae, demonstrating how ReporTree is able to rapidly identify the main species genogroups and characterize them with key surveillance metadata, such as antibiotic resistance data. 
By providing examples for SARS-CoV-2 and the foodborne bacterial pathogen Listeria monocytogenes, we show how this tool is currently a useful asset in genomics-informed routine surveillance and outbreak detection of a wide variety of species. CONCLUSIONS In summary, ReporTree is a pan-pathogen tool for automated and reproducible identification and characterization of genetic clusters that contributes to a sustainable and efficient public health genomics-informed pathogen surveillance. ReporTree is implemented in Python 3.8 and is freely available at https://github.com/insapathogenomics/ReporTree.
Affiliation(s)
- Verónica Mixão
- Genomics and Bioinformatics Unit, Department of Infectious Diseases, National Institute of Health Doutor Ricardo Jorge (INSA), Lisbon, Portugal
| | - Miguel Pinto
- Genomics and Bioinformatics Unit, Department of Infectious Diseases, National Institute of Health Doutor Ricardo Jorge (INSA), Lisbon, Portugal
| | - Daniel Sobral
- Genomics and Bioinformatics Unit, Department of Infectious Diseases, National Institute of Health Doutor Ricardo Jorge (INSA), Lisbon, Portugal
| | - Adriano Di Pasquale
- National Reference Centre (NRC) for Whole Genome Sequencing of Microbial Pathogens: Database and Bioinformatics analysis (GENPAT), Istituto Zooprofilattico Sperimentale Dell'Abruzzo E del Molise "Giuseppe Caporale" (IZSAM), Teramo, Italy
| | - João Paulo Gomes
- Genomics and Bioinformatics Unit, Department of Infectious Diseases, National Institute of Health Doutor Ricardo Jorge (INSA), Lisbon, Portugal
| | - Vítor Borges
- Genomics and Bioinformatics Unit, Department of Infectious Diseases, National Institute of Health Doutor Ricardo Jorge (INSA), Lisbon, Portugal.
| |
|
22
|
Chen C, Taepper A, Engelniederhammer F, Kellerer J, Roemer C, Stadler T. LAPIS is a fast web API for massive open virus sequencing data. BMC Bioinformatics 2023; 24:232. [PMID: 37277732 DOI: 10.1186/s12859-023-05364-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/01/2022] [Accepted: 05/23/2023] [Indexed: 06/07/2023] Open
Abstract
BACKGROUND Recent epidemic outbreaks such as the SARS-CoV-2 pandemic and the mpox outbreak in 2022 have demonstrated the value of genomic sequencing data for tracking the origin and spread of pathogens. Laboratories around the globe generated new sequences at unprecedented speed and volume and bioinformaticians developed new tools and dashboards to analyze this wealth of data. However, a major challenge that remains is the lack of simple and efficient approaches for accessing and processing sequencing data. RESULTS The Lightweight API for Sequences (LAPIS) facilitates rapid retrieval and analysis of genomic sequencing data through a REST API. It supports complex mutation- and metadata-based queries and can perform aggregation operations on massive datasets. LAPIS is optimized for typical questions relevant to genomic epidemiology. Using a newly-developed in-memory database engine, it has a high speed and throughput: between 25 January and 4 February 2023, the SARS-CoV-2 instance of LAPIS, which contains 14.5 million sequences, processed over 20 million requests with a mean response time of 411 ms and a median response time of 1 ms. LAPIS is the core engine behind our dashboards on genspectrum.org and we currently maintain public LAPIS instances for SARS-CoV-2 and mpox. CONCLUSIONS Powered by an optimized database engine and available through a web API, LAPIS enhances the accessibility of genomic sequencing data. It is designed to serve as a common backend for dashboards and analyses with the potential to be integrated into common database platforms such as GenBank.
Affiliation(s)
- Chaoran Chen
  - Department of Biosystems Science and Engineering, ETH Zürich, Basel, Switzerland
  - Swiss Institute of Bioinformatics, Basel, Switzerland
- Alexander Taepper
  - Department of Biosystems Science and Engineering, ETH Zürich, Basel, Switzerland
  - School of Computation, Information and Technology - Informatics, TU Munich, Munich, Germany
- Cornelius Roemer
  - Swiss Institute of Bioinformatics, Basel, Switzerland
  - Biozentrum, University of Basel, Basel, Switzerland
- Tanja Stadler
  - Department of Biosystems Science and Engineering, ETH Zürich, Basel, Switzerland
  - Swiss Institute of Bioinformatics, Basel, Switzerland
|
23
|
Martínez-Martínez FJ, Massinga AJ, De Jesus Á, Ernesto RM, Cano-Jiménez P, Chiner-Oms Á, Gómez-Navarro I, Guillot-Fernández M, Guinovart C, Sitoe A, Vubil D, Bila R, Gujamo R, Enosse S, Jiménez-Serrano S, Torres-Puente M, Comas I, Mandomando I, López MG, Mayor A. Tracking SARS-CoV-2 introductions in Mozambique using pandemic-scale phylogenies: a retrospective observational study. Lancet Glob Health 2023; 11:e933-e941. [PMID: 37202028 DOI: 10.1016/s2214-109x(23)00169-9] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 09/26/2022] [Revised: 03/09/2023] [Accepted: 03/23/2023] [Indexed: 05/20/2023]
Abstract
BACKGROUND From the start of the SARS-CoV-2 outbreak, global sequencing efforts have generated an unprecedented amount of genomic data. Nonetheless, unequal sampling between high-income and low-income countries hinders the implementation of genomic surveillance systems at the global and local level. Filling the knowledge gaps of genomic information and understanding pandemic dynamics in low-income countries is essential for public health decision making and to prepare for future pandemics. In this context, we aimed to discover the timing and origin of SARS-CoV-2 variant introductions in Mozambique, taking advantage of pandemic-scale phylogenies. METHODS We did a retrospective, observational study in southern Mozambique. Patients from Manhiça presenting with respiratory symptoms were recruited, and those enrolled in clinical trials were excluded. Data were included from three sources: (1) a prospective hospital-based surveillance study (MozCOVID), recruiting patients living in Manhiça, attending the Manhiça district hospital, and fulfilling the criteria of suspected COVID-19 case according to WHO; (2) symptomatic and asymptomatic individuals with SARS-CoV-2 infection recruited by the National Surveillance system; and (3) sequences from SARS-CoV-2-infected Mozambican cases deposited on the Global Initiative on Sharing Avian Influenza Data database. Positive samples amenable for sequencing were analysed. We used Ultrafast Sample placement on Existing tRees to understand the dynamics of beta and delta waves, using available genomic data. This tool can reconstruct a phylogeny with millions of sequences by efficient sample placement in a tree. We reconstructed a phylogeny (~7·6 million sequences) adding new and publicly available beta and delta sequences. FINDINGS A total of 5793 patients were recruited between Nov 1, 2020, and Aug 31, 2021. During this time, 133 328 COVID-19 cases were reported in Mozambique. 
280 good quality new SARS-CoV-2 sequences were obtained after the inclusion criteria were applied and an additional 652 beta (B.1.351) and delta (B.1.617.2) public sequences were included from Mozambique. We evaluated 373 beta and 559 delta sequences. We identified 187 beta introductions (including 295 sequences), divided in 42 transmission groups and 145 unique introductions, mostly from South Africa, between August, 2020 and July, 2021. For delta, we identified 220 introductions (including 494 sequences), with 49 transmission groups and 171 unique introductions, mostly from the UK, India, and South Africa, between April and November, 2021. INTERPRETATION The timing and origin of introductions suggests that movement restrictions effectively avoided introductions from non-African countries, but not from surrounding countries. Our results raise questions about the imbalance between the consequences of restrictions and health benefits. This new understanding of pandemic dynamics in Mozambique can be used to inform public health interventions to control the spread of new variants. FUNDING European and Developing Countries Clinical Trials, European Research Council, Bill & Melinda Gates Foundation, and Agència de Gestió d'Ajuts Universitaris i de Recerca.
Affiliation(s)
- Francisco José Martínez-Martínez
  - Tuberculosis Genomics Unit, Instituto de Biomedicina de Valencia, Consejo Superior de Investigaciones Científicas, Valencia, Spain
- Áuria De Jesus
  - Centro de Investigação em Saúde de Manhiça, Maputo, Mozambique
- Rita M Ernesto
  - Centro de Investigação em Saúde de Manhiça, Maputo, Mozambique
- Pablo Cano-Jiménez
  - Tuberculosis Genomics Unit, Instituto de Biomedicina de Valencia, Consejo Superior de Investigaciones Científicas, Valencia, Spain
- Álvaro Chiner-Oms
  - Tuberculosis Genomics Unit, Instituto de Biomedicina de Valencia, Consejo Superior de Investigaciones Científicas, Valencia, Spain
- Inmaculada Gómez-Navarro
  - Tuberculosis Genomics Unit, Instituto de Biomedicina de Valencia, Consejo Superior de Investigaciones Científicas, Valencia, Spain
- Marina Guillot-Fernández
  - Tuberculosis Genomics Unit, Instituto de Biomedicina de Valencia, Consejo Superior de Investigaciones Científicas, Valencia, Spain
- António Sitoe
  - Centro de Investigação em Saúde de Manhiça, Maputo, Mozambique
- Delfino Vubil
  - Centro de Investigação em Saúde de Manhiça, Maputo, Mozambique
- Rubão Bila
  - Hospital Distrital da Manhiça, Marracuene, Mozambique
- Sónia Enosse
  - Instituto Nacional de Saúde, Marracuene, Mozambique
- Santiago Jiménez-Serrano
  - Tuberculosis Genomics Unit, Instituto de Biomedicina de Valencia, Consejo Superior de Investigaciones Científicas, Valencia, Spain
- Manuela Torres-Puente
  - Tuberculosis Genomics Unit, Instituto de Biomedicina de Valencia, Consejo Superior de Investigaciones Científicas, Valencia, Spain
- Iñaki Comas
  - Tuberculosis Genomics Unit, Instituto de Biomedicina de Valencia, Consejo Superior de Investigaciones Científicas, Valencia, Spain; Centro de Investigación Biomédica en Red en Epidemiología y Salud Pública (CIBERESP), Madrid, Spain
- Inácio Mandomando
  - Centro de Investigação em Saúde de Manhiça, Maputo, Mozambique; Instituto Nacional de Saúde, Marracuene, Mozambique
- Mariana G López
  - Tuberculosis Genomics Unit, Instituto de Biomedicina de Valencia, Consejo Superior de Investigaciones Científicas, Valencia, Spain
- Alfredo Mayor
  - Centro de Investigação em Saúde de Manhiça, Maputo, Mozambique; ISGlobal, Hospital Clínic - Universitat de Barcelona, Barcelona, Spain; Centro de Investigación Biomédica en Red en Epidemiología y Salud Pública (CIBERESP), Madrid, Spain; Department of Physiologic Sciences, Faculty of Medicine, Universidade Eduardo Mondlane, Maputo, Mozambique
|
24
|
Cheng Y, Ji C, Zhou HY, Zheng H, Wu A. Web Resources for SARS-CoV-2 Genomic Database, Annotation, Analysis and Variant Tracking. Viruses 2023; 15:1158. [PMID: 37243244 PMCID: PMC10222785 DOI: 10.3390/v15051158] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 04/04/2023] [Revised: 05/10/2023] [Accepted: 05/10/2023] [Indexed: 05/28/2023] Open
Abstract
The SARS-CoV-2 genomic data continue to grow, providing valuable information for researchers and public health officials. Genomic analysis of these data sheds light on the transmission and evolution of the virus. To aid in SARS-CoV-2 genomic analysis, many web resources have been developed to store, collate, analyze, and visualize the genomic data. This review summarizes web resources used for the SARS-CoV-2 genomic epidemiology, covering data management and sharing, genomic annotation, analysis, and variant tracking. The challenges and further expectations for these web resources are also discussed. Finally, we highlight the importance and need for continued development and improvement of related web resources to effectively track the spread and understand the evolution of the virus.
Affiliation(s)
- Yexiao Cheng
  - School of Life Science and Technology, China Pharmaceutical University, Nanjing 211100, China
  - Institute of Systems Medicine, Chinese Academy of Medical Sciences & Peking Union Medical College, Beijing 100005, China
  - Suzhou Institute of Systems Medicine, Suzhou 215123, China
- Chengyang Ji
  - Institute of Systems Medicine, Chinese Academy of Medical Sciences & Peking Union Medical College, Beijing 100005, China
  - Suzhou Institute of Systems Medicine, Suzhou 215123, China
- Hang-Yu Zhou
  - Institute of Systems Medicine, Chinese Academy of Medical Sciences & Peking Union Medical College, Beijing 100005, China
  - Suzhou Institute of Systems Medicine, Suzhou 215123, China
- Heng Zheng
  - School of Life Science and Technology, China Pharmaceutical University, Nanjing 211100, China
- Aiping Wu
  - Institute of Systems Medicine, Chinese Academy of Medical Sciences & Peking Union Medical College, Beijing 100005, China
  - Suzhou Institute of Systems Medicine, Suzhou 215123, China
|
25
|
De Maio N, Kalaghatgi P, Turakhia Y, Corbett-Detig R, Minh BQ, Goldman N. Maximum likelihood pandemic-scale phylogenetics. Nat Genet 2023; 55:746-752. [PMID: 37038003 PMCID: PMC10181937 DOI: 10.1038/s41588-023-01368-0] [Citation(s) in RCA: 13] [Impact Index Per Article: 13.0] [Received: 08/19/2022] [Accepted: 03/07/2023] [Indexed: 04/12/2023]
Abstract
Phylogenetics has a crucial role in genomic epidemiology. Enabled by unparalleled volumes of genome sequence data generated to study and help contain the COVID-19 pandemic, phylogenetic analyses of SARS-CoV-2 genomes have shed light on the virus's origins, spread, and the emergence and reproductive success of new variants. However, most phylogenetic approaches, including maximum likelihood and Bayesian methods, cannot scale to the size of the datasets from the current pandemic. We present 'MAximum Parsimonious Likelihood Estimation' (MAPLE), an approach for likelihood-based phylogenetic analysis of epidemiological genomic datasets at unprecedented scales. MAPLE infers SARS-CoV-2 phylogenies more accurately than existing maximum likelihood approaches while running up to thousands of times faster, and requiring at least 100 times less memory on large datasets. This extends the reach of genomic epidemiology, allowing the continued use of accurate phylogenetic, phylogeographic and phylodynamic analyses on datasets of millions of genomes.
Affiliation(s)
- Nicola De Maio
  - European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Hinxton, UK
- Yatish Turakhia
  - Department of Electrical and Computer Engineering, University of California San Diego, San Diego, CA, USA
- Russell Corbett-Detig
  - Department of Biomolecular Engineering, University of California Santa Cruz, Santa Cruz, CA, USA
  - Genomics Institute, University of California Santa Cruz, Santa Cruz, CA, USA
- Bui Quang Minh
  - School of Computing, College of Engineering, Computing and Cybernetics, Australian National University, Canberra, Australian Capital Territory, Australia
- Nick Goldman
  - European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Hinxton, UK
|
26
|
Kramer AM, Sanderson T, Corbett-Detig R. Treenome Browser: co-visualization of enormous phylogenies and millions of genomes. Bioinformatics 2023; 39:btac772. [PMID: 36453872 PMCID: PMC9805588 DOI: 10.1093/bioinformatics/btac772] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 09/27/2022] [Revised: 11/22/2022] [Accepted: 11/29/2022] [Indexed: 12/03/2022] Open
Abstract
SUMMARY Treenome Browser is a web browser tool to interactively visualize millions of genomes alongside huge phylogenetic trees. AVAILABILITY AND IMPLEMENTATION Treenome Browser for SARS-CoV-2 can be accessed at cov2tree.org, or at taxonium.org for user-provided trees. Source code and documentation are available at github.com/theosanderson/taxonium and docs.taxonium.org/en/latest/treenome.html. SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Alexander M Kramer
  - Department of Biomolecular Engineering, University of California, Santa Cruz, Santa Cruz, CA 95064, USA
  - Genomics Institute, University of California, Santa Cruz, Santa Cruz, CA 95064, USA
- Russell Corbett-Detig
  - Department of Biomolecular Engineering, University of California, Santa Cruz, Santa Cruz, CA 95064, USA
  - Genomics Institute, University of California, Santa Cruz, Santa Cruz, CA 95064, USA
|