1
Haddox HK, Angehrn G, Sesta L, Jennings-Shaffer C, Temple SD, Galloway JG, DeWitt WS, Bloom JD, Matsen FA, Neher RA. The mutation rate of SARS-CoV-2 is highly variable between sites and is influenced by sequence context, genomic region, and RNA structure. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2025:2025.01.07.631013. [PMID: 39829847 PMCID: PMC11741320 DOI: 10.1101/2025.01.07.631013] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/22/2025]
Abstract
RNA viruses like SARS-CoV-2 have a high mutation rate, which contributes to their rapid evolution. The rate of mutations depends on the mutation type (e.g., A→C, A→G, etc.) and can vary between sites in the viral genome. Understanding this variation can shed light on the mutational processes at play, and is crucial for quantitative modeling of viral evolution. Using the millions of available SARS-CoV-2 full-genome sequences, we estimate rates of synonymous mutations for all 12 possible nucleotide mutation types and examine how much these rates vary between sites. We find a surprisingly high level of variability and several striking patterns: the rates of four mutation types suddenly increase at one of two gene boundaries; the rates of most mutation types strongly depend on a site's local sequence context, with up to 56-fold differences between contexts; consistent with a previous study, the rates of some mutation types are lower at sites engaged in RNA secondary structure. A simple log-linear model of these features explains ~15-60% of the fold-variation of mutation rates between sites, depending on mutation type; more complex models only modestly improve predictive power out of sample. We estimate the fitness effect of each mutation based on the number of times it actually occurs versus the number of times it is expected to occur based on the model. We identify several small regions of the genome where synonymous or noncoding mutations occur much less often than expected, indicative of strong purifying selection on the RNA sequence that is independent of protein sequence. Overall, this work expands our basic understanding of SARS-CoV-2's evolution by characterizing the virus's mutation process at the level of individual sites and uncovering several striking mutational patterns that arise from unknown mechanisms.
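The two quantitative ideas in this abstract, a log-linear model of per-site mutation rates and a fitness estimate from actual versus expected counts, can be illustrated with a minimal sketch. This is not the authors' implementation: the feature names, the effect sizes, and the pseudocount of 0.5 are all hypothetical choices made here for illustration.

```python
import math

def expected_count(base_log_count, feature_effects, active_features):
    """Log-linear model: log(expected) = baseline + sum of effects of active features.

    base_log_count  -- log of the baseline expected count for one mutation type
    feature_effects -- dict mapping a (hypothetical) feature name to its additive log effect
    active_features -- features that apply at this site (e.g. region, context, pairing)
    """
    log_count = base_log_count + sum(feature_effects[f] for f in active_features)
    return math.exp(log_count)

def fitness_effect(observed, expected, pseudocount=0.5):
    """Log ratio of observed to expected counts.

    The pseudocount (an assumption here) keeps the ratio finite when observed is 0.
    Values well below 0 suggest the mutation occurs less often than the model predicts.
    """
    return math.log((observed + pseudocount) / (expected + pseudocount))

# Hypothetical effects for a single mutation type (illustrative numbers only).
effects = {"paired": math.log(0.5), "favoured_context": math.log(4.0)}

# A site that is base-paired and sits in a favoured context.
exp_n = expected_count(math.log(10.0), effects, ["paired", "favoured_context"])
print(exp_n, fitness_effect(observed=2, expected=exp_n))
```

Working in log space makes each feature's contribution multiplicative on the rate, which is what "fold-variation between sites" refers to.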
Affiliation(s)
- Hugh K Haddox
- Computational Biology Program, Fred Hutchinson Cancer Research Center, Seattle, WA, USA
- Luca Sesta
- Biozentrum, University of Basel, Basel, Switzerland
- Swiss Institute of Bioinformatics, Basel, Switzerland
- Seth D Temple
- Computational Biology Program, Fred Hutchinson Cancer Research Center, Seattle, WA, USA
- Department of Statistics, University of Washington, Seattle, WA, USA
- Department of Statistics, University of Michigan, Ann Arbor, MI, USA
- Michigan Institute for Data & AI in Society, University of Michigan, Ann Arbor, MI, USA
- Jared G Galloway
- Computational Biology Program, Fred Hutchinson Cancer Research Center, Seattle, WA, USA
- William S DeWitt
- Department of Genome Sciences, University of Washington, Seattle, WA, USA
- Computational Biology Program, Fred Hutchinson Cancer Research Center, Seattle, WA, USA
- Jesse D Bloom
- Computational Biology Program, Fred Hutchinson Cancer Research Center, Seattle, WA, USA
- Basic Sciences Division, Fred Hutchinson Cancer Research Center, Seattle, WA, USA
- Howard Hughes Medical Institute, Seattle, WA, USA
- Frederick A Matsen
- Computational Biology Program, Fred Hutchinson Cancer Research Center, Seattle, WA, USA
- Department of Statistics, University of Washington, Seattle, WA, USA
- Department of Genome Sciences, University of Washington, Seattle, WA, USA
- Howard Hughes Medical Institute, Seattle, WA, USA
- Richard A Neher
- Biozentrum, University of Basel, Basel, Switzerland
- Swiss Institute of Bioinformatics, Basel, Switzerland
2
Simmonds P. C→U transition biases in SARS-CoV-2: still rampant 4 years from the start of the COVID-19 pandemic. mBio 2024; 15:e0249324. [PMID: 39475243 PMCID: PMC11633203 DOI: 10.1128/mbio.02493-24] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/19/2024] [Accepted: 09/24/2024] [Indexed: 12/12/2024]
Abstract
The evolution of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) in the pandemic and post-pandemic periods has been characterized by rapid adaptive changes that confer immune escape and enhanced human-to-human transmissibility. Sequence change is additionally marked by an excess number of C→U transitions suggested as being due to host-mediated genome editing. To investigate how these influence the evolutionary trajectory of SARS-CoV-2, 2,000 high-quality, coding complete genome sequences of SARS-CoV-2 variants collected pre-September 2020 and from each subsequently appearing alpha, delta, BA.1, BA.2, BA.5, XBB, EG, HK, and JN.1 lineages were downloaded from NCBI Virus in April 2024. C→U transitions were the most common substitution during the diversification of SARS-CoV-2 lineages over the 4-year observation period. A net loss of C bases and accumulation of U's occurred at a constant rate of approximately 0.2%-0.25%/decade. C→U transitions occurred in over a quarter of all sites with a C (26.5%; range 20.0%-37.2%) around five times more than observed for the other transitions (5.3%-6.8%). In contrast to an approximately random distribution of other transitions across the genome, most C→U substitutions occurred at statistically preferred sites in each lineage. However, only the most C→U polymorphic sites showed evidence for a preferred 5'U context previously associated with APOBEC 3A editing. There was a similarly weak preference for unpaired bases suggesting much less stringent targeting of RNA than mediated by A3 deaminases in DNA editing. Future functional studies are required to determine editing preferences, impacts on replication fitness in vivo of SARS-CoV-2 and other RNA viruses, and impact on host tropism.
IMPORTANCE Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) in the pandemic and post-pandemic periods has shown a remarkable capacity to adapt and evade human immune responses and increase its human-to-human transmissibility. The genome of SARS-CoV-2 is also increasingly scarred by the effects of multiple C→U mutations from host genome editing as a cellular defense mechanism akin to restriction factors for retroviruses. Through the analysis of large data sets of SARS-CoV-2 isolate sequences collected throughout the pandemic period and beyond, we show that C→U transitions have driven a base compositional change over time amounting to a net loss of C bases and accumulation of U's at a rate of approximately 0.2%-0.25%/decade. Most C→U substitutions occurred in the absence of the preferred upstream-base context or targeting of unpaired RNA bases previously associated with the host RNA editing protein, APOBEC 3A. The analyses provide a series of testable hypotheses that can be experimentally investigated in the future.
Affiliation(s)
- Peter Simmonds
- Nuffield Department of Medicine, Peter Medawar Building for Pathogen Research, University of Oxford, Oxford, United Kingdom
3
Hunt M, Hinrichs AS, Anderson D, Karim L, Dearlove BL, Knaggs J, Constantinides B, Fowler PW, Rodger G, Street T, Lumley S, Webster H, Sanderson T, Ruis C, Kotzen B, de Maio N, Amenga-Etego LN, Amuzu DSY, Avaro M, Awandare GA, Ayivor-Djanie R, Barkham T, Bashton M, Batty EM, Bediako Y, De Belder D, Benedetti E, Bergthaler A, Boers SA, Campos J, Carr RAA, Chen YYC, Cuba F, Dattero ME, Dejnirattisai W, Dilthey A, Duedu KO, Endler L, Engelmann I, Francisco NM, Fuchs J, Gnimpieba EZ, Groc S, Gyamfi J, Heemskerk D, Houwaart T, Hsiao NY, Huska M, Hölzer M, Iranzadeh A, Jarva H, Jeewandara C, Jolly B, Joseph R, Kant R, Ki KKK, Kurkela S, Lappalainen M, Lataretu M, Lemieux J, Liu C, Malavige GN, Mashe T, Mongkolsapaya J, Montes B, Mora JAM, Morang'a CM, Mvula B, Nagarajan N, Nelson A, Ngoi JM, da Paixão JP, Panning M, Poklepovich T, Quashie PK, Ranasinghe D, Russo M, San JE, Sanderson ND, Scaria V, Screaton G, Sessions OM, Sironen T, Sisay A, Smith D, Smura T, Supasa P, Suphavilai C, Swann J, Tegally H, Tegomoh B, Vapalahti O, Walker A, Wilkinson RJ, Williamson C, Zair X, de Oliveira T, Peto TE, Crook D, Corbett-Detig R, Iqbal Z. Addressing pandemic-wide systematic errors in the SARS-CoV-2 phylogeny. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2024:2024.04.29.591666. [PMID: 38746185 PMCID: PMC11092452 DOI: 10.1101/2024.04.29.591666] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/16/2024]
Abstract
The SARS-CoV-2 genome occupies a unique place in infection biology - it is the most highly sequenced genome on earth (making up over 20% of public sequencing datasets) with fine-scale information on sampling date and geography, and has been subject to unprecedented intense analysis. As a result, these phylogenetic data are an incredibly valuable resource for science and public health. However, the vast majority of the data was sequenced by tiling amplicons across the full genome, with amplicon schemes that changed over the pandemic as mutations in the viral genome interacted with primer binding sites. In combination with the disparate set of genome assembly workflows and lack of consistent quality control (QC) processes, the current genomes have many systematic errors that have evolved with the virus and amplicon schemes. These errors have significant impacts on the phylogeny, and therefore over the last few years, many thousands of hours of researchers' time have been spent in "eyeballing" trees, looking for artefacts, and then patching the tree. Given the huge value of this dataset, we therefore set out to reprocess the complete set of public raw sequence data in a rigorous amplicon-aware manner, and build a cleaner phylogeny. Here we provide a global tree of 4,471,579 samples, built from a consistently assembled set of high-quality consensus sequences from all available public data as of June 2024, viewable at https://viridian.taxonium.org. Each genome was constructed using a novel assembly tool called Viridian (https://github.com/iqbal-lab-org/viridian), developed specifically to process amplicon sequence data, eliminating artefactual errors and masking the genome at low-quality positions. We provide simulation and empirical validation of the methodology, and quantify the improvement in the phylogeny. We hope the tree, consensus sequences and Viridian will be a valuable resource for researchers.
Affiliation(s)
- Martin Hunt
- European Molecular Biology Laboratory - European Bioinformatics Institute, Hinxton, UK
- Nuffield Department of Medicine, University of Oxford, Oxford, UK
- National Institute of Health Research Oxford Biomedical Research Centre, John Radcliffe Hospital, Headley Way, Oxford, UK
- Health Protection Research Unit in Healthcare Associated Infections and Antimicrobial Resistance, University of Oxford, Oxford, UK
- Angie S Hinrichs
- Genomics Institute, University of California, Santa Cruz, Santa Cruz, CA
- Daniel Anderson
- European Molecular Biology Laboratory - European Bioinformatics Institute, Hinxton, UK
- Lily Karim
- Genomics Institute, University of California, Santa Cruz, Santa Cruz, CA
- Department of Biomolecular Engineering, University of California, Santa Cruz, Santa Cruz, CA
- Bethany L Dearlove
- Institute for Hygiene and Applied Immunology, Center for Pathophysiology, Infectiology and Immunology, Medical University of Vienna, Vienna 1090, Austria
- Jeff Knaggs
- European Molecular Biology Laboratory - European Bioinformatics Institute, Hinxton, UK
- Nuffield Department of Medicine, University of Oxford, Oxford, UK
- National Institute of Health Research Oxford Biomedical Research Centre, John Radcliffe Hospital, Headley Way, Oxford, UK
- Health Protection Research Unit in Healthcare Associated Infections and Antimicrobial Resistance, University of Oxford, Oxford, UK
- Bede Constantinides
- Nuffield Department of Medicine, University of Oxford, Oxford, UK
- Health Protection Research Unit in Healthcare Associated Infections and Antimicrobial Resistance, University of Oxford, Oxford, UK
- Philip W Fowler
- Nuffield Department of Medicine, University of Oxford, Oxford, UK
- National Institute of Health Research Oxford Biomedical Research Centre, John Radcliffe Hospital, Headley Way, Oxford, UK
- Health Protection Research Unit in Healthcare Associated Infections and Antimicrobial Resistance, University of Oxford, Oxford, UK
- Gillian Rodger
- Nuffield Department of Medicine, University of Oxford, Oxford, UK
- Health Protection Research Unit in Healthcare Associated Infections and Antimicrobial Resistance, University of Oxford, Oxford, UK
- Teresa Street
- Nuffield Department of Medicine, University of Oxford, Oxford, UK
- National Institute of Health Research Oxford Biomedical Research Centre, John Radcliffe Hospital, Headley Way, Oxford, UK
- Sheila Lumley
- Nuffield Department of Medicine, University of Oxford, Oxford, UK
- Department of Infectious Diseases and Microbiology, John Radcliffe Hospital, Oxford, UK
- Hermione Webster
- Nuffield Department of Medicine, University of Oxford, Oxford, UK
- Christopher Ruis
- Victor Phillip Dahdaleh Heart & Lung Research Institute, University of Cambridge, Cambridge, UK
- Department of Veterinary Medicine, University of Cambridge, Cambridge, UK
- Benjamin Kotzen
- Department of Infectious Diseases, Massachusetts General Hospital, Boston, Massachusetts, USA
- Nicola de Maio
- European Molecular Biology Laboratory - European Bioinformatics Institute, Hinxton, UK
- Lucas N Amenga-Etego
- West African Centre for Cell Biology of Infectious Pathogens (WACCBIP), University of Ghana, Accra, Ghana
- Dominic S Y Amuzu
- West African Centre for Cell Biology of Infectious Pathogens (WACCBIP), University of Ghana, Accra, Ghana
- Martin Avaro
- Servicio de Virus Respiratorios, Instituto Nacional Enfermedades Infecciosas, ANLIS "Dr. Carlos G. Malbrán", Buenos Aires, Argentina
- Gordon A Awandare
- West African Centre for Cell Biology of Infectious Pathogens (WACCBIP), University of Ghana, Accra, Ghana
- Reuben Ayivor-Djanie
- Laboratory for Medical Biotechnology and Biomanufacturing, International Centre for Genetic Engineering and Biotechnology, Trieste, Italy
- Department of Biomedical Sciences, University of Health and Allied Sciences, Ho, Ghana
- Matthew Bashton
- The Hub for Biotechnology in the Built Environment, Department of Applied Sciences, Faculty of Health and Life Sciences, Northumbria University, Newcastle upon Tyne, NE1 8ST, UK
- Elizabeth M Batty
- Centre for Tropical Medicine and Global Health, Nuffield Department of Medicine, University of Oxford, Oxford, UK
- Mahidol-Oxford Tropical Medicine Research Unit, Bangkok, Thailand
- Yaw Bediako
- West African Centre for Cell Biology of Infectious Pathogens (WACCBIP), University of Ghana, Accra, Ghana
- Denise De Belder
- Unidad Operativa Centro Nacional de Genómica y Bioinformática, ANLIS "Dr. Carlos G. Malbrán", Buenos Aires, Argentina
- Estefania Benedetti
- Servicio de Virus Respiratorios, Instituto Nacional Enfermedades Infecciosas, ANLIS "Dr. Carlos G. Malbrán", Buenos Aires, Argentina
- Andreas Bergthaler
- Institute for Hygiene and Applied Immunology, Center for Pathophysiology, Infectiology and Immunology, Medical University of Vienna, Vienna 1090, Austria
- Stefan A Boers
- Dept. Medical Microbiology, Leiden University Medical Center, Albinusdreef 2, 2333 ZA, Leiden, The Netherlands
- Josefina Campos
- Unidad Operativa Centro Nacional de Genómica y Bioinformática, ANLIS "Dr. Carlos G. Malbrán", Buenos Aires, Argentina
- Rosina Afua Ampomah Carr
- Department of Biomedical Sciences, University of Health and Allied Sciences, Ho, Ghana
- Department of Computational Medicine and Bioinformatics, University of Michigan, Michigan, Ann Arbor, MI, USA
- Facundo Cuba
- Unidad Operativa Centro Nacional de Genómica y Bioinformática, ANLIS "Dr. Carlos G. Malbrán", Buenos Aires, Argentina
- Maria Elena Dattero
- Servicio de Virus Respiratorios, Instituto Nacional Enfermedades Infecciosas, ANLIS "Dr. Carlos G. Malbrán", Buenos Aires, Argentina
- Wanwisa Dejnirattisai
- Division of Emerging Infectious Disease, Research Department, Faculty of Medicine Siriraj Hospital, Mahidol University, Bangkoknoi, Bangkok 10700, Thailand
- Alexander Dilthey
- Institute of Medical Microbiology and Hospital Hygiene, University Hospital Düsseldorf, Heinrich Heine University Düsseldorf, Düsseldorf, Germany
- Kwabena Obeng Duedu
- Department of Biomedical Sciences, University of Health and Allied Sciences, Ho, Ghana
- College of Life Sciences, Birmingham City University, Birmingham, UK
- Lukas Endler
- Institute for Hygiene and Applied Immunology, Center for Pathophysiology, Infectiology and Immunology, Medical University of Vienna, Vienna 1090, Austria
- Ilka Engelmann
- Pathogenesis and Control of Chronic and Emerging Infections, Univ Montpellier, INSERM, Etablissement Français du Sang, Virology Laboratory, CHU Montpellier, Montpellier, France
- Ngiambudulu M Francisco
- Grupo de Investigação Microbiana e Imunológica, Instituto Nacional de Investigação em Saúde (National Institute for Health Research), Luanda, Angola
- Jonas Fuchs
- Institute of Virology, Freiburg University Medical Center, Faculty of Medicine, University of Freiburg, Freiburg, Germany
- Etienne Z Gnimpieba
- Biomedical Engineering Department, University of South Dakota, Sioux Falls, SD 57107
- Soraya Groc
- Virology Laboratory, CHU Montpellier, Montpellier, France
- Jones Gyamfi
- Department of Biomedical Sciences, University of Health and Allied Sciences, Ho, Ghana
- School of Health and Life Sciences, Teesside University, Middlesbrough, UK
- Dennis Heemskerk
- Dept. Medical Microbiology, Leiden University Medical Center, Albinusdreef 2, 2333 ZA, Leiden, The Netherlands
- Torsten Houwaart
- Institute of Medical Microbiology and Hospital Hygiene, University Hospital Düsseldorf, Heinrich Heine University Düsseldorf, Düsseldorf, Germany
- Nei-Yuan Hsiao
- Division of Medical Virology, University of Cape Town and National Health Laboratory Service
- Matthew Huska
- Genome Competence Center (MF1), Robert Koch Institute, Nordufer 20, 13353 Berlin, Germany
- Martin Hölzer
- Genome Competence Center (MF1), Robert Koch Institute, Nordufer 20, 13353 Berlin, Germany
- Hanna Jarva
- HUS Diagnostic Center, Clinical Microbiology, University of Helsinki and Helsinki University Hospital, Helsinki, Finland
- Chandima Jeewandara
- Allergy Immunology and Cell Biology Unit, Department of Immunology and Molecular Medicine, University of Sri Jayewardenepura, Nugegoda, Sri Lanka
- Bani Jolly
- Karkinos Healthcare Private Limited (KHPL), Aurbis Business Parks, Bellandur, Bengaluru, Karnataka, 560103, India
- Academy of Scientific and Innovative Research (AcSIR), Ghaziabad, Uttar Pradesh, India
- Ravi Kant
- Department of Veterinary Biosciences, University of Helsinki, 00014 Helsinki, Finland
- Department of Virology, University of Helsinki, 00014 Helsinki, Finland
- Department of Tropical Parasitology, Institute of Maritime and Tropical Medicine, Medical University of Gdansk, 81-519 Gdynia, Poland
- Satu Kurkela
- HUS Diagnostic Center, Clinical Microbiology, University of Helsinki and Helsinki University Hospital, Helsinki, Finland
- Maija Lappalainen
- HUS Diagnostic Center, Clinical Microbiology, University of Helsinki and Helsinki University Hospital, Helsinki, Finland
- Marie Lataretu
- Genome Competence Center (MF1), Robert Koch Institute, Nordufer 20, 13353 Berlin, Germany
- Jacob Lemieux
- Department of Infectious Diseases, Massachusetts General Hospital, Boston, Massachusetts, USA
- Chang Liu
- Chinese Academy of Medical Science (CAMS) Oxford Institute (COI), University of Oxford, Oxford, UK
- Wellcome Centre for Human Genetics, Nuffield Department of Medicine, University of Oxford, Oxford, UK
- Gathsaurie Neelika Malavige
- Allergy Immunology and Cell Biology Unit, Department of Immunology and Molecular Medicine, University of Sri Jayewardenepura, Nugegoda, Sri Lanka
- Tapfumanei Mashe
- Health System Strengthening Unit, World Health Organisation, Harare, Zimbabwe
- Juthathip Mongkolsapaya
- Mahidol-Oxford Tropical Medicine Research Unit, Bangkok, Thailand
- Chinese Academy of Medical Science (CAMS) Oxford Institute (COI), University of Oxford, Oxford, UK
- Wellcome Centre for Human Genetics, Nuffield Department of Medicine, University of Oxford, Oxford, UK
- Jose Arturo Molina Mora
- Centro de investigación en Enfermedades Tropicales & Facultad de Microbiología, Universidad de Costa Rica, Costa Rica
- Collins M Morang'a
- West African Centre for Cell Biology of Infectious Pathogens (WACCBIP), University of Ghana, Accra, Ghana
- Bernard Mvula
- Public Health Institute of Malawi, Ministry of Health, Malawi
- Niranjan Nagarajan
- Genome Institute of Singapore, Agency for Science, Technology and Research (A*STAR), Singapore
- Yong Loo Lin School of Medicine, National University of Singapore, Singapore
- Andrew Nelson
- Department of Applied Sciences, Faculty of Health and Life Sciences, Northumbria University, Newcastle upon Tyne, NE1 8ST, UK
- Joyce M Ngoi
- West African Centre for Cell Biology of Infectious Pathogens (WACCBIP), University of Ghana, Accra, Ghana
- Joana Paula da Paixão
- Grupo de Investigação Microbiana e Imunológica, Instituto Nacional de Investigação em Saúde (National Institute for Health Research), Luanda, Angola
- Marcus Panning
- Institute of Virology, Freiburg University Medical Center, Faculty of Medicine, University of Freiburg, Freiburg, Germany
- Tomas Poklepovich
- Unidad Operativa Centro Nacional de Genómica y Bioinformática, ANLIS "Dr. Carlos G. Malbrán", Buenos Aires, Argentina
- Peter K Quashie
- West African Centre for Cell Biology of Infectious Pathogens (WACCBIP), University of Ghana, Accra, Ghana
- Diyanath Ranasinghe
- Allergy Immunology and Cell Biology Unit, Department of Immunology and Molecular Medicine, University of Sri Jayewardenepura, Nugegoda, Sri Lanka
- Mara Russo
- Servicio de Virus Respiratorios, Instituto Nacional Enfermedades Infecciosas, ANLIS "Dr. Carlos G. Malbrán", Buenos Aires, Argentina
- James Emmanuel San
- Duke Human Vaccine Institute, Duke University, Durham, NC 27710
- University of KwaZulu Natal, Durban, South Africa, 4001
- Nicholas D Sanderson
- Nuffield Department of Medicine, University of Oxford, Oxford, UK
- National Institute of Health Research Oxford Biomedical Research Centre, John Radcliffe Hospital, Headley Way, Oxford, UK
- Vinod Scaria
- Academy of Scientific and Innovative Research (AcSIR), Ghaziabad, Uttar Pradesh, India
- Vishwanath Cancer Care Foundation (VCCF), Neelkanth Business Park Kirol Village, West Mumbai, Maharashtra, 400086, India
- Gavin Screaton
- Nuffield Department of Medicine, University of Oxford, Oxford, UK
- Tarja Sironen
- Department of Veterinary Biosciences, University of Helsinki, 00014 Helsinki, Finland
- Department of Virology, University of Helsinki, 00014 Helsinki, Finland
- Abay Sisay
- Department of Medical Laboratory Sciences, College of Health Sciences, Addis Ababa University, P.O.Box 1176, Addis Ababa, Ethiopia
- Darren Smith
- The Hub for Biotechnology in the Built Environment, Department of Applied Sciences, Faculty of Health and Life Sciences, Northumbria University, Newcastle upon Tyne, NE1 8ST, UK
- Teemu Smura
- Department of Veterinary Biosciences, University of Helsinki, 00014 Helsinki, Finland
- Department of Virology, University of Helsinki, 00014 Helsinki, Finland
- Piyada Supasa
- Chinese Academy of Medical Science (CAMS) Oxford Institute (COI), University of Oxford, Oxford, UK
- Wellcome Centre for Human Genetics, Nuffield Department of Medicine, University of Oxford, Oxford, UK
- Chayaporn Suphavilai
- Genome Institute of Singapore, Agency for Science, Technology and Research (A*STAR), Singapore
- Jeremy Swann
- Nuffield Department of Medicine, University of Oxford, Oxford, UK
- Houriiyah Tegally
- Centre for Epidemic Response and Innovation (CERI), Stellenbosch University, South Africa
- Bryan Tegomoh
- Centre de Coordination des Opérations d'Urgences de Santé Publique, Ministere de Sante Publique, Cameroun
- University of California, Berkeley, Berkeley, California, USA
- Nebraska Department of Health and Human Services, Lincoln, Nebraska, USA
- Olli Vapalahti
- Department of Veterinary Biosciences, University of Helsinki, 00014 Helsinki, Finland
- Department of Virology, University of Helsinki, 00014 Helsinki, Finland
- Andreas Walker
- Institute of Virology, University Hospital Düsseldorf, Heinrich Heine University Düsseldorf, Düsseldorf, Germany
- Robert J Wilkinson
- Francis Crick Institute, London, UK
- Centre for Infectious Diseases Research in Africa, University of Cape Town
- Imperial College London, UK
- Xavier Zair
- Saw Swee Hock School of Public Health, National University of Singapore
- Tulio de Oliveira
- Centre for Epidemic Response and Innovation (CERI), Stellenbosch University, South Africa
- KwaZulu-Natal Research Innovation and Sequencing Platform (KRISP), University of KwaZulu-Natal, South Africa
- Timothy Ea Peto
- Nuffield Department of Medicine, University of Oxford, Oxford, UK
- Derrick Crook
- Nuffield Department of Medicine, University of Oxford, Oxford, UK
- Russell Corbett-Detig
- Genomics Institute, University of California, Santa Cruz, Santa Cruz, CA
- Department of Biomolecular Engineering, University of California, Santa Cruz, Santa Cruz, CA
- Zamin Iqbal
- European Molecular Biology Laboratory - European Bioinformatics Institute, Hinxton, UK
- Milner Centre for Evolution, University of Bath, UK
4
Xia X. How Trustworthy Are the Genomic Sequences of SARS-CoV-2 in GenBank? Microorganisms 2024; 12:2187. [PMID: 39597576 PMCID: PMC11596409 DOI: 10.3390/microorganisms12112187] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/21/2024] [Revised: 10/27/2024] [Accepted: 10/27/2024] [Indexed: 11/29/2024]
Abstract
Well-annotated gene and genomic sequences serve as a foundation for making inferences in molecular biology and evolution and can directly impact public health. The first SARS-CoV-2 genome was submitted to the GenBank database hosted by the U.S. National Center for Biotechnology Information and used to develop the two successful vaccines. Conserved protein domains are often chosen as targets for developing antiviral medicines or vaccines. Mutation and substitution patterns provide crucial information not only on functional motifs and genome/protein interactions but also for characterizing phylogenetic relationships among viral strains. These patterns, together with the collection time of viral samples, serve as the basis for addressing the question of when and where the host-switching event occurred. Unfortunately, viral genomic sequences submitted to GenBank undergo little quality control, and critical information in the annotation is frequently changed without being recorded. Researchers often have no choice but to hold blind faith in the authenticity of the sequences. There have been reports of incorrect genome annotation but no report that casts doubt on the genomic sequences themselves because it seems theoretically impossible to identify genomic sequences that may not be authentic. This paper takes an innovative approach to show that some SARS-CoV-2 genomes submitted to GenBank cannot possibly be authentic. Specifically, some SARS-CoV-2 genomic sequences deposited in GenBank with collection times in 2023 and 2024, isolated from saliva, nasopharyngeal, sewage, and stool, are identical to the reference genome of SARS-CoV-2 (NC_045512). The probability of such occurrence is effectively 0. I also compile SARS-CoV-2 genomes with changed sample collection times. One may be led astray in bioinformatic analysis without being aware of errors in sequences and sequence annotation.
Affiliation(s)
- Xuhua Xia
- Department of Biology, University of Ottawa, Marie-Curie Private, Ottawa, ON K1N 6N5, Canada
- Ottawa Institute of Systems Biology, University of Ottawa, Ottawa, ON K1H 8M5, Canada
5
Ghaemi S, Abdoli A, Karimi H, Saadatpour F, Arefian E. The impact of host microRNAs on the development of conserved mutations of SARS-CoV-2. Sci Rep 2024; 14:22091. [PMID: 39333651 PMCID: PMC11437047 DOI: 10.1038/s41598-024-70974-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/11/2024] [Accepted: 08/22/2024] [Indexed: 09/29/2024]
Abstract
SARS-CoV-2, the virus responsible for the COVID-19 pandemic, has undergone various genetic alterations due to evolutionary pressures exerted by host cells, including intracellular antiviral mechanisms such as targeting by human microRNAs (miRNAs). This study investigates the impact of miRNAs hsa-miR-3132 and hsa-miR-4650 on the viral genome. Sequence alignment revealed conserved mutations in the binding sites of these miRNAs in adapted strains compared to the original Wuhan-Hu-1 strain, leading to their deletion. Despite modest expression of these miRNAs in SARS-CoV-2 target tissues, their efficacy against mutant strains is reduced due to the loss of binding sites. Structural analysis indicates that the mutant genome is more stable than the Wuhan-Hu-1 genome. Luciferase and virus titration assays demonstrate that hsa-miR-3132 and hsa-miR-4650 effectively target the Nsp3 gene in the Wuhan-Hu-1 strain but not in mutant strains lacking their binding sites. These findings suggest that the observed mutations help the virus evade selective pressure from human miRNAs, contributing to its adaptation.
Affiliation(s)
- Shokoofeh Ghaemi
- Department of Microbiology, School of Biology, College of Science, University of Tehran, Tehran, Iran
- Asghar Abdoli
- Department of Hepatitis and AIDS, Pasteur Institute of Iran, Tehran, Iran
- Amirabad Virology Laboratory, Vaccine Unit, Tehran, 1413693341, Iran
- Hesam Karimi
- Department of Virology, Faculty of Medical Sciences, Tarbiat Modares University, Tehran, Iran
- Fatemeh Saadatpour
- Department of Microbiology, School of Biology, College of Science, University of Tehran, Tehran, Iran
- Ehsan Arefian
- Department of Microbiology, School of Biology, College of Science, University of Tehran, Tehran, Iran
- Stem Cells Technology and Tissue Regeneration Department, School of Biology, College of Science, University of Tehran, Tehran, Iran
6
Boon WX, Sia BZ, Ng CH. Prediction of the effects of the top 10 synonymous mutations from 26645 SARS-CoV-2 genomes of early pandemic phase. F1000Res 2024; 10:1053. [PMID: 39268187 PMCID: PMC11391198 DOI: 10.12688/f1000research.72896.3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 09/11/2024] [Indexed: 09/15/2024]
Abstract
Background: The emergence of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) has led to a global pandemic since December 2019. SARS-CoV-2 is a single-stranded RNA virus, which mutates at a high rate. Many studies have examined nonsynonymous mutations, which change protein sequences. However, few studies have examined the effects of SARS-CoV-2 synonymous mutations, which may affect viral fitness. This study aims to predict the effect of synonymous mutations on the SARS-CoV-2 genome. Methods: A total of 26645 SARS-CoV-2 genomic sequences retrieved from the Global Initiative on Sharing All Influenza Data (GISAID) database were aligned using MAFFT. Then, the mutations and their respective frequencies were identified. Multiple RNA secondary structure prediction tools, namely RNAfold, IPknot++ and MXfold2, were applied to predict the effect of the mutations on RNA secondary structure, and their base pair probabilities were estimated using MutaRNA. Relative synonymous codon usage (RSCU) analysis was also performed to measure the codon usage bias (CUB) of SARS-CoV-2. Results: A total of 150 synonymous mutations were identified. The synonymous mutation with the highest frequency is C3037U in the nsp3 region of ORF1a. Of the 10 highest-frequency synonymous mutations, C913U, C3037U, U16176C and C18877U show pronounced changes between wild type and mutant in all 3 RNA secondary structure prediction tools, suggesting that these mutations may have some biological impact on viral fitness. These four mutations also show changes in base pair probabilities. All mutations except U16176C change the codon to a more preferred codon, which may result in higher translation efficiency. Conclusion: Synonymous mutations in the SARS-CoV-2 genome may affect RNA secondary structure, changing base pair probabilities and possibly resulting in a higher translation rate. However, lab experiments are required to validate the results obtained from the prediction analysis.
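Editorial illustration (not part of the cited abstract): the relative synonymous codon usage (RSCU) mentioned above is commonly defined as the observed count of a codon divided by the count expected if all synonymous codons for that amino acid were used equally. The sketch below assumes the standard genetic code and an in-frame coding sequence; the function name `rscu` and the toy input are hypothetical and are not taken from the study.

```python
from collections import Counter
from itertools import product

# Standard genetic code in the conventional TCAG ordering (assumption:
# the study's exact codon table and tooling are not given in the abstract).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def rscu(cds):
    """Return {codon: RSCU} for an in-frame coding sequence (DNA or RNA)."""
    cds = cds.upper().replace("U", "T")
    usable = len(cds) - len(cds) % 3          # ignore a trailing partial codon
    codons = (cds[i:i + 3] for i in range(0, usable, 3))
    counts = Counter(c for c in codons if c in CODON_TABLE)

    result = {}
    for aa in set(AMINO) - {"*"}:             # skip stop codons
        family = [c for c, a in CODON_TABLE.items() if a == aa]
        total = sum(counts[c] for c in family)
        if total == 0:
            continue                          # amino acid absent from the input
        for c in family:
            # observed count / (total / number of synonymous codons)
            result[c] = counts[c] * len(family) / total
    return result


if __name__ == "__main__":
    # Toy alanine-only sequence: GCT x2, GCC x1, GCA x1, GCG x0
    for codon, value in sorted(rscu("GCTGCTGCCGCA").items()):
        print(codon, round(value, 2))
```

An RSCU above 1 indicates a codon used more often than expected under equal usage, and a value below 1 indicates a codon used less often.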
Affiliation(s)
- Wan Xin Boon
  - Faculty of Information Science and Technology, Multimedia University, Bukit Beruang, Melaka, 75450, Malaysia
- Boon Zhan Sia
  - Faculty of Information Science and Technology, Multimedia University, Bukit Beruang, Melaka, 75450, Malaysia
- Chong Han Ng
  - Faculty of Information Science and Technology, Multimedia University, Bukit Beruang, Melaka, 75450, Malaysia
|
7
|
Holmes EC. The Emergence and Evolution of SARS-CoV-2. Annu Rev Virol 2024; 11:21-42. [PMID: 38631919 DOI: 10.1146/annurev-virology-093022-013037] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/19/2024]
Abstract
The origin of SARS-CoV-2 has evoked heated debate and strong accusations, yet seemingly little resolution. I review the scientific evidence on the origin of SARS-CoV-2 and its subsequent spread through the human population. The available data clearly point to a natural zoonotic emergence within, or closely linked to, the Huanan Seafood Wholesale Market in Wuhan. There is no direct evidence linking the emergence of SARS-CoV-2 to laboratory work conducted at the Wuhan Institute of Virology. The subsequent global spread of SARS-CoV-2 was characterized by a gradual adaptation to humans, with dual increases in transmissibility and virulence until the emergence of the Omicron variant. Of note has been the frequent transmission of SARS-CoV-2 from humans to other animals, marking it as a strongly host generalist virus. Unless lessons from the origin of SARS-CoV-2 are learned, it is inevitable that more zoonotic events leading to more epidemics and pandemics will plague human populations.
Affiliation(s)
- Edward C Holmes
  - Sydney Institute for Infectious Diseases, School of Medical Sciences, The University of Sydney, Sydney, New South Wales, Australia
|
8
|
Hong JS, Tindall JM, Tindall SR, Sorscher EJ. Mutation accumulation in H. sapiens F508del CFTR countermands dN/dS type genomic analysis. PLoS One 2024; 19:e0305832. [PMID: 39024311 PMCID: PMC11257350 DOI: 10.1371/journal.pone.0305832] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/06/2024] [Accepted: 06/05/2024] [Indexed: 07/20/2024] Open
Abstract
Understanding the mechanisms that underlie de novo mutations (DNMs) can be essential for interpreting human evolution, including aspects such as rapidly diverging genes, conservation of non-coding regulatory elements, and somatic DNA adaptation, among others. DNM accumulation in Homo sapiens is often limited to evaluation of human trios or quads across a single generation. Moreover, human SNPs in exons, pseudogenes, or other non-coding elements can be ancient and difficult to date, including polymorphisms attributable to founder effects and identity by descent. In this report, we describe multigenerational evolution of a human coding locus devoid of natural selection, and delineate patterns and principles by which DNMs have accumulated over the past few thousand years. We apply a data set comprising cystic fibrosis transmembrane conductance regulator (CFTR) alleles from 2,393 individuals homozygous for the F508del defect. Additional polymorphism on the F508del background diversified subsequent to a single mutational event during recent human history. Because F508del CFTR is without function, SNPs observed on this haplotype are effectively attributable to factors that govern accumulating de novo mutations. We show profound enhancement of transition, synonymous, and positionally repetitive polymorphisms, indicating appearance of DNMs in a manner evolutionarily designed to protect protein coding DNA against mutational attrition while promoting diversity.
Affiliation(s)
- Jeong S. Hong
  - Emory University School of Medicine, Atlanta, Georgia, United States of America
- Janice M. Tindall
  - Emory University School of Medicine, Atlanta, Georgia, United States of America
- Samuel R. Tindall
  - Emory University School of Medicine, Atlanta, Georgia, United States of America
- Eric J. Sorscher
  - Emory University School of Medicine, Atlanta, Georgia, United States of America
|
9
|
Tiwary BK. A positive selection at binding site 501 in the B.1 lineage might have triggered the highly infectious sub-lineages of SARS-CoV-2. Gene 2024; 915:148427. [PMID: 38575097 DOI: 10.1016/j.gene.2024.148427] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/21/2024] [Revised: 03/21/2024] [Accepted: 04/01/2024] [Indexed: 04/06/2024]
Abstract
The descendants of the B lineage are the most predominant variants among the SARS-CoV-2 virus due to the incorporation of new mutations augmenting the infectivity of the virus. There is a substantial increase in the transition transversion bias, nucleotide diversity and purifying selection on the spike protein in the descendants of the B lineage of the SARS-CoV-2 virus on a temporal scale. A strong bias for C-to-U substitutions is found in the genes encoding spike protein in this lineage. The positive selection has operated on the spike gene of B lineages and its sub-lineages. The B.1 lineage has undergone positive selection on site 501 of the receptor binding domain ultimately reflected in a key substitution N501Y in its three descendant lineages namely B.1.1.7, B.1.351 and P.1. The intensity of purifying selection on the multiple sites of the spike gene has increased substantially in the sub-lineages of B.1 in a timescale. The binding site 501 on the spike protein in B lineage is found to coevolve with other amino acid sites. This study sheds light on the evolutionary trajectory of the B lineage into highly infectious descendants in the recent past under the influence of positive and purifying selection exerted by natural immunity and vaccination of the host.
Affiliation(s)
- Basant K Tiwary
  - Department of Bioinformatics, School of Life Sciences, Pondicherry University, Pondicherry 605 014, India
|
10
|
Moss B. Understanding the biology of monkeypox virus to prevent future outbreaks. Nat Microbiol 2024; 9:1408-1416. [PMID: 38724757 DOI: 10.1038/s41564-024-01690-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/15/2023] [Accepted: 03/26/2024] [Indexed: 06/07/2024]
Abstract
Historically, monkeypox (mpox) was a zoonotic disease endemic in Africa. However, in 2022, a global outbreak occurred following a substantial increase in cases in Africa, coupled with spread by international travellers to other continents. Between January 2022 and October 2023, about 91,000 confirmed cases from 115 countries were reported, leading the World Health Organization to declare a public health emergency. The basic biology of monkeypox virus (MPXV) can be inferred from other poxviruses, such as vaccinia virus, and confirmed by genome sequencing. Here the biology of MPXV is reviewed, together with a discussion of adaptive changes during MPXV evolution and implications for transmission. Studying MPXV biology is important to inform specific host interactions, to aid in ongoing outbreaks and to predict those in the future.
Affiliation(s)
- Bernard Moss
  - Laboratory of Viral Diseases, National Institute of Allergy and Infectious Diseases, National Institutes of Health, Bethesda, MD, USA
|
11
|
Liu Y, Sapoval N, Gallego-García P, Tomás L, Posada D, Treangen TJ, Stadler LB. Crykey: Rapid identification of SARS-CoV-2 cryptic mutations in wastewater. Nat Commun 2024; 15:4545. [PMID: 38806450 PMCID: PMC11133379 DOI: 10.1038/s41467-024-48334-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/15/2023] [Accepted: 04/29/2024] [Indexed: 05/30/2024] Open
Abstract
Wastewater surveillance for SARS-CoV-2 provides early warnings of emerging variants of concern and can be used to screen for novel cryptic linked-read mutations, which are co-occurring single nucleotide mutations that are rare, or entirely missing, in existing SARS-CoV-2 databases. While previous approaches have focused on specific regions of the SARS-CoV-2 genome, there is a need for computational tools capable of efficiently tracking cryptic mutations across the entire genome and investigating their potential origin. We present Crykey, a tool for rapidly identifying rare linked-read mutations across the genome of SARS-CoV-2. We evaluated the utility of Crykey on over 3,000 wastewater and over 22,000 clinical samples; our findings are three-fold: i) we identify hundreds of cryptic mutations that cover the entire SARS-CoV-2 genome, ii) we track the presence of these cryptic mutations across multiple wastewater treatment plants and over three years of sampling in Houston, and iii) we find that a handful of cryptic mutations in wastewater mirror cryptic mutations in clinical samples and investigate their potential to represent real cryptic lineages. In summary, Crykey enables large-scale detection of cryptic mutations in wastewater that represent potential circulating cryptic lineages, serving as a new computational tool for wastewater surveillance of SARS-CoV-2.
Affiliation(s)
- Yunxi Liu
  - Department of Computer Science, Rice University, Houston, TX, 77005, USA
- Nicolae Sapoval
  - Department of Computer Science, Rice University, Houston, TX, 77005, USA
- Pilar Gallego-García
  - CINBIO, Universidade de Vigo, 36310, Vigo, Spain
  - Galicia Sur Health Research Institute (IIS Galicia Sur), SERGAS-UVIGO, Vigo, Spain
- Laura Tomás
  - CINBIO, Universidade de Vigo, 36310, Vigo, Spain
  - Galicia Sur Health Research Institute (IIS Galicia Sur), SERGAS-UVIGO, Vigo, Spain
- David Posada
  - CINBIO, Universidade de Vigo, 36310, Vigo, Spain
  - Galicia Sur Health Research Institute (IIS Galicia Sur), SERGAS-UVIGO, Vigo, Spain
  - Department of Biochemistry, Genetics, and Immunology, Universidade de Vigo, 36310, Vigo, Spain
- Todd J Treangen
  - Department of Computer Science, Rice University, Houston, TX, 77005, USA
- Lauren B Stadler
  - Department of Civil and Environmental Engineering, Rice University, Houston, TX, 77005, USA
|
12
|
Cahuantzi R, Lythgoe KA, Hall I, Pellis L, House T. Unsupervised identification of significant lineages of SARS-CoV-2 through scalable machine learning methods. Proc Natl Acad Sci U S A 2024; 121:e2317284121. [PMID: 38478692 PMCID: PMC10962941 DOI: 10.1073/pnas.2317284121] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/10/2023] [Accepted: 02/05/2024] [Indexed: 03/21/2024] Open
Abstract
Since its emergence in late 2019, SARS-CoV-2 has diversified into a large number of lineages and caused multiple waves of infection globally. Novel lineages have the potential to spread rapidly and internationally if they have higher intrinsic transmissibility and/or can evade host immune responses, as has been seen with the Alpha, Delta, and Omicron variants of concern. They can also cause increased mortality and morbidity if they have increased virulence, as was seen for Alpha and Delta. Phylogenetic methods provide the "gold standard" for representing the global diversity of SARS-CoV-2 and to identify newly emerging lineages. However, these methods are computationally expensive, struggle when datasets get too large, and require manual curation to designate new lineages. These challenges provide a motivation to develop complementary methods that can incorporate all of the genetic data available without down-sampling to extract meaningful information rapidly and with minimal curation. In this paper, we demonstrate the utility of using algorithmic approaches based on word-statistics to represent whole sequences, bringing speed, scalability, and interpretability to the construction of genetic topologies. While not serving as a substitute for current phylogenetic analyses, the proposed methods can be used as a complementary, and fully automatable, approach to identify and confirm new emerging variants.
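Editorial illustration (not part of the cited abstract): one simple way to represent whole sequences by word statistics is to count overlapping k-mers and compare the resulting count vectors. The abstract does not specify the word length, the distance measure, or the downstream method, so the choice of k = 3, the cosine distance, and the function names below are assumptions made only for this sketch.

```python
from collections import Counter
from math import sqrt


def kmer_counts(seq, k=3):
    """Count overlapping words of length k in a sequence."""
    seq = seq.upper()
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def cosine_distance(a, b):
    """1 - cosine similarity between two k-mer count vectors."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("sequence is shorter than k")
    return 1.0 - dot / (norm_a * norm_b)


if __name__ == "__main__":
    # Toy sequences (hypothetical, not real genomes)
    ref = kmer_counts("ACGTACGTACGT")
    near = kmer_counts("ACGTACGTACGA")   # one substitution at the end
    far = kmer_counts("TTTTTTTTTTTT")
    print(cosine_distance(ref, near))    # small: most words are shared
    print(cosine_distance(ref, far))     # 1.0: no words are shared
```

Because each sequence is reduced to a fixed set of word counts, pairwise comparison needs no alignment or tree search, which is the kind of scalability the abstract emphasizes.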
Affiliation(s)
- Roberto Cahuantzi
  - Department of Mathematics, The University of Manchester, Manchester M13 9PL, United Kingdom
  - United Kingdom Health Security Agency, University of Oxford, Oxford OX3 7LF, United Kingdom
- Katrina A. Lythgoe
  - Department of Biology, University of Oxford, Oxford OX1 3SZ, United Kingdom
  - Big Data Institute, University of Oxford, Oxford OX3 7LF, United Kingdom
  - Pandemic Sciences Institute, University of Oxford, Oxford OX3 7LF, United Kingdom
- Ian Hall
  - Department of Mathematics, The University of Manchester, Manchester M13 9PL, United Kingdom
- Lorenzo Pellis
  - Department of Mathematics, The University of Manchester, Manchester M13 9PL, United Kingdom
- Thomas House
  - Department of Mathematics, The University of Manchester, Manchester M13 9PL, United Kingdom
|
13
|
Gupta S, Gupta D, Bhatnagar S. Analysis of SARS-CoV-2 genome evolutionary patterns. Microbiol Spectr 2024; 12:e0265423. [PMID: 38197644 PMCID: PMC10846092 DOI: 10.1128/spectrum.02654-23] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/26/2023] [Accepted: 11/20/2023] [Indexed: 01/11/2024] Open
Abstract
The spread of the SARS-CoV-2 virus, accompanied by the public availability of abundant sequence data, provides a window for the determination of viral evolutionary patterns. In this study, SARS-CoV-2 genome sequences were collected from seven countries in the period January 2020-December 2022. The sequences were classified into three phases, namely, pre-vaccination, post-vaccination, and recent period. Comparison was performed between these phases based on parameters like mutation rates, selection pressure (dN/dS ratio), and transition to transversion ratios (Ti/Tv). Similar comparisons were performed among SARS-CoV-2 variants. Statistical significance was tested using the GraphPad unpaired t-test. The analysis showed an increase in the percent genomic mutation rates post-vaccination and in recent periods across all countries relative to the pre-vaccination sequences. Mutation rates were highest in NSP3, S, N, and NSP12b before vaccination and increased further after vaccination. NSP4 showed the largest change in mutation rates after vaccination. The dN/dS ratios showed purifying selection that shifted toward neutral selection after vaccination. N, ORF8, ORF3a, and ORF10 were under the highest positive selection before vaccination. The shift toward neutral selection was driven by E, NSP3, and ORF7a in the after-vaccination set. In recent sequences, the largest dN/dS change was observed in E, NSP1, and NSP13. The Ti/Tv ratios decreased with time. C→U and G→U were the most frequent transition and transversion, respectively. However, U→G was the most frequent transversion in the recent period. The Omicron variant had the highest genomic mutation rates, while Delta showed the highest dN/dS ratio. Protein-wise dN/dS ratio was also seen to vary across the different variants.
IMPORTANCE: To the best of our knowledge, there exists no other large-scale study of the genomic and protein-wise mutation patterns during the time course of evolution in different countries. Analyzing the SARS-CoV-2 evolutionary patterns in view of the varying spatial, temporal, and biological signals is important for diagnostics, therapeutics, and pharmacovigilance of SARS-CoV-2.
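Editorial illustration (not part of the cited abstract): the transition to transversion ratio (Ti/Tv) referred to above can be computed from a list of point mutations by classifying each change as a transition (purine to purine, A↔G, or pyrimidine to pyrimidine, C↔U) or a transversion (any purine↔pyrimidine change). The function name and the toy mutation list below are hypothetical; the study's actual pipeline is not described in the abstract.

```python
# Transitions after normalising RNA "U" to DNA "T"
TRANSITIONS = {("A", "G"), ("G", "A"), ("C", "T"), ("T", "C")}


def ti_tv_ratio(mutations):
    """Return (transitions, transversions, ratio) for (ref, alt) base pairs."""
    ti = tv = 0
    for ref, alt in mutations:
        ref = ref.upper().replace("U", "T")
        alt = alt.upper().replace("U", "T")
        if ref == alt:
            continue                      # not a substitution
        if (ref, alt) in TRANSITIONS:
            ti += 1
        else:
            tv += 1
    ratio = ti / tv if tv else float("inf")
    return ti, tv, ratio


if __name__ == "__main__":
    # Toy list: C->U and A->G are transitions, G->U is a transversion
    print(ti_tv_ratio([("C", "U"), ("A", "G"), ("G", "U")]))
```

A falling Ti/Tv ratio over time, as the abstract reports, means transversions make up a growing share of the observed substitutions.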
Affiliation(s)
- Shubhangi Gupta
  - Department of Biological Sciences and Engineering, Computational and Structural Biology Laboratory, Netaji Subhas University of Technology, Dwarka, New Delhi, India
- Deepanshu Gupta
  - Division of Biotechnology, Computational and Structural Biology Laboratory, Netaji Subhas Institute of Technology, Dwarka, New Delhi, India
- Sonika Bhatnagar
  - Department of Biological Sciences and Engineering, Computational and Structural Biology Laboratory, Netaji Subhas University of Technology, Dwarka, New Delhi, India
  - Division of Biotechnology, Computational and Structural Biology Laboratory, Netaji Subhas Institute of Technology, Dwarka, New Delhi, India
|
14
|
Dubey S, Verma DK, Kumar M. Severe acute respiratory syndrome Coronavirus-2 GenoAnalyzer and mutagenic anomaly detector using FCMFI and NSCE. Int J Biol Macromol 2024; 258:129051. [PMID: 38159703 DOI: 10.1016/j.ijbiomac.2023.129051] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 05/11/2023] [Revised: 11/08/2023] [Accepted: 12/24/2023] [Indexed: 01/03/2024]
Abstract
To deepen our understanding of the virus and help guide the creation of efficient therapies, this study uses artificial intelligence tools to thoroughly explore the genetic sequences of the SARS-CoV-2 virus. The process starts by using the Fuzzy Closure Miner for Frequent Itemsets (FCMFI) on a large corpus of SARS-CoV-2 genomic sequences to reveal hidden patterns, including nucleotide base sequences, repeating motifs, and corresponding interchanges. Then, using the Nucleotide Sequence Comprehension Engine (NSCE) technique, we were able to precisely define the genomic areas for mutation analysis. Structured and unstructured proteins are both strongly impacted by virus mutations, with spike proteins that are linked to the severity of COVID-19 pneumonia being particularly affected. Notably, the Mutagenic Anomaly Detector shows a 65% efficiency boost in computing genome mutation rates compared to conventional point mutation analysis, while GenoAnalyzer offers a 93.33% improvement over existing approaches in recognizing common genomic sequence patterns. These results highlight the potential of FCMFI to reveal complex genomic patterns and significant insights in COVID-19 genetic sequences when combined with mutation analysis. The Mutagenic Anomaly Detector and GenoAnalyzer show promise for revealing hidden genomic patterns and precisely estimating the SARS-CoV-2 mutation rate.
Affiliation(s)
- Shivendra Dubey
  - Department of Computer Science & Engineering, Jaypee University of Engineering & Technology, Guna, Madhya Pradesh Pin-473226, India
- Dinesh Kumar Verma
  - Department of Computer Science & Engineering, Jaypee University of Engineering & Technology, Guna, Madhya Pradesh Pin-473226, India
- Mahesh Kumar
  - Department of Computer Science & Engineering, Jaypee University of Engineering & Technology, Guna, Madhya Pradesh Pin-473226, India
|
15
|
Zech F, Jung C, Jacob T, Kirchhoff F. Causes and Consequences of Coronavirus Spike Protein Variability. Viruses 2024; 16:177. [PMID: 38399953 PMCID: PMC10892391 DOI: 10.3390/v16020177] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/28/2023] [Revised: 01/20/2024] [Accepted: 01/23/2024] [Indexed: 02/25/2024] Open
Abstract
Coronaviruses are a large family of enveloped RNA viruses found in numerous animal species. They are well known for their ability to cross species barriers and have been transmitted from bats or intermediate hosts to humans on several occasions. Four of the seven human coronaviruses (hCoVs) are responsible for approximately 20% of common colds (hCoV-229E, -NL63, -OC43, -HKU1). Two others (SARS-CoV-1 and MERS-CoV) cause severe and frequently lethal respiratory syndromes but have only spread to very limited extents in the human population. In contrast the most recent human hCoV, SARS-CoV-2, while exhibiting intermediate pathogenicity, has a profound impact on public health due to its enormous spread. In this review, we discuss which initial features of the SARS-CoV-2 Spike protein and subsequent adaptations to the new human host may have helped this pathogen to cause the COVID-19 pandemic. Our focus is on host forces driving changes in the Spike protein and their consequences for virus infectivity, pathogenicity, immune evasion and resistance to preventive or therapeutic agents. In addition, we briefly address the significance and perspectives of broad-spectrum therapeutics and vaccines.
Affiliation(s)
- Fabian Zech
  - Institute of Molecular Virology, Ulm University Medical Center, 89081 Ulm, Germany
- Christoph Jung
  - Institute of Electrochemistry, Ulm University, 89081 Ulm, Germany
  - Helmholtz-Institute Ulm (HIU) Electrochemical Energy Storage, 89081 Ulm, Germany
  - Karlsruhe Institute of Technology (KIT), 76021 Karlsruhe, Germany
- Timo Jacob
  - Institute of Electrochemistry, Ulm University, 89081 Ulm, Germany
  - Helmholtz-Institute Ulm (HIU) Electrochemical Energy Storage, 89081 Ulm, Germany
  - Karlsruhe Institute of Technology (KIT), 76021 Karlsruhe, Germany
- Frank Kirchhoff
  - Institute of Molecular Virology, Ulm University Medical Center, 89081 Ulm, Germany
|
16
|
Chatterjee S, Zaia J. Proteomics-based mass spectrometry profiling of SARS-CoV-2 infection from human nasopharyngeal samples. Mass Spectrom Rev 2024; 43:193-229. [PMID: 36177493 PMCID: PMC9538640 DOI: 10.1002/mas.21813] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 06/20/2022] [Revised: 09/07/2022] [Accepted: 09/09/2022] [Indexed: 05/12/2023]
Abstract
Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) is the cause of the on-going global pandemic of coronavirus disease 2019 (COVID-19) that continues to pose a significant threat to public health worldwide. SARS-CoV-2 encodes four structural proteins namely membrane, nucleocapsid, spike, and envelope proteins that play essential roles in viral entry, fusion, and attachment to the host cell. Extensively glycosylated spike protein efficiently binds to the host angiotensin-converting enzyme 2 initiating viral entry and pathogenesis. Reverse transcriptase polymerase chain reaction on nasopharyngeal swab is the preferred method of sample collection and viral detection because it is a rapid, specific, and high-throughput technique. Alternate strategies such as proteomics and glycoproteomics-based mass spectrometry enable a more detailed and holistic view of the viral proteins and host-pathogen interactions and help in detection of potential disease markers. In this review, we highlight the use of mass spectrometry methods to profile the SARS-CoV-2 proteome from clinical nasopharyngeal swab samples. We also highlight the necessity for a comprehensive glycoproteomics mapping of SARS-CoV-2 from biological complex matrices to identify potential COVID-19 markers.
Affiliation(s)
- Sayantani Chatterjee
  - Department of Biochemistry, Center for Biomedical Mass Spectrometry, Boston University School of Medicine, Boston, Massachusetts, USA
- Joseph Zaia
  - Department of Biochemistry, Center for Biomedical Mass Spectrometry, Boston University School of Medicine, Boston, Massachusetts, USA
  - Bioinformatics Program, Boston University School of Medicine, Boston, Massachusetts, USA
|
17
|
Liu Y, Sapoval N, Gallego-García P, Tomás L, Posada D, Treangen TJ, Stadler LB. Crykey: Rapid Identification of SARS-CoV-2 Cryptic Mutations in Wastewater. medRxiv 2023:2023.06.16.23291524. [PMID: 37986916 PMCID: PMC10659477 DOI: 10.1101/2023.06.16.23291524] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/22/2023]
Abstract
We present Crykey, a computational tool for rapidly identifying cryptic mutations of SARS-CoV-2. Specifically, we identify co-occurring single nucleotide mutations on the same sequencing read, called linked-read mutations, that are rare or entirely missing in existing databases, and have the potential to represent novel cryptic lineages found in wastewater. While previous approaches exist for identifying cryptic linked-read mutations from specific regions of the SARS-CoV-2 genome, there is a need for computational tools capable of efficiently tracking cryptic mutations across the entire genome and for tens of thousands of samples and with increased scrutiny, given their potential to represent either artifacts or hidden SARS-CoV-2 lineages. Crykey fills this gap by identifying rare linked-read mutations that pass stringent computational filters to limit the potential for artifacts. We evaluate the utility of Crykey on >3,000 wastewater and >22,000 clinical samples; our findings are three-fold: i) we identify hundreds of cryptic mutations that cover the entire SARS-CoV-2 genome, ii) we track the presence of these cryptic mutations across multiple wastewater treatment plants and over three years of sampling in Houston, and iii) we find that a handful of cryptic mutations in wastewater mirror cryptic mutations in clinical samples and investigate their potential to represent real cryptic lineages. In summary, Crykey enables large-scale detection of cryptic mutations representing potential cryptic lineages in wastewater.
Affiliation(s)
- Yunxi Liu
  - Department of Computer Science, Rice University, Houston, TX, 77005, USA
- Nicolae Sapoval
  - Department of Computer Science, Rice University, Houston, TX, 77005, USA
- Pilar Gallego-García
  - CINBIO, Universidade de Vigo, 36310 Vigo, Spain
  - Galicia Sur Health Research Institute (IIS Galicia Sur), SERGAS-UVIGO
- Laura Tomás
  - CINBIO, Universidade de Vigo, 36310 Vigo, Spain
  - Galicia Sur Health Research Institute (IIS Galicia Sur), SERGAS-UVIGO
- David Posada
  - CINBIO, Universidade de Vigo, 36310 Vigo, Spain
  - Galicia Sur Health Research Institute (IIS Galicia Sur), SERGAS-UVIGO
  - Department of Biochemistry, Genetics, and Immunology, Universidade de Vigo, 36310 Vigo, Spain
- Todd J. Treangen
  - Department of Computer Science, Rice University, Houston, TX, 77005, USA
- Lauren B. Stadler
  - Department of Civil and Environmental Engineering, Rice University, Houston, TX, 77005, USA
|
18
|
Kramer AM, Thornlow B, Ye C, De Maio N, McBroome J, Hinrichs AS, Lanfear R, Turakhia Y, Corbett-Detig R. Online Phylogenetics with matOptimize Produces Equivalent Trees and is Dramatically More Efficient for Large SARS-CoV-2 Phylogenies than de novo and Maximum-Likelihood Implementations. Syst Biol 2023; 72:1039-1051. [PMID: 37232476 PMCID: PMC10627557 DOI: 10.1093/sysbio/syad031] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Received: 11/29/2021] [Revised: 05/14/2023] [Accepted: 06/22/2023] [Indexed: 05/27/2023] Open
Abstract
Phylogenetics has been foundational to SARS-CoV-2 research and public health policy, assisting in genomic surveillance, contact tracing, and assessing emergence and spread of new variants. However, phylogenetic analyses of SARS-CoV-2 have often relied on tools designed for de novo phylogenetic inference, in which all data are collected before any analysis is performed and the phylogeny is inferred once from scratch. SARS-CoV-2 data sets do not fit this mold. There are currently over 14 million sequenced SARS-CoV-2 genomes in online databases, with tens of thousands of new genomes added every day. Continuous data collection, combined with the public health relevance of SARS-CoV-2, invites an "online" approach to phylogenetics, in which new samples are added to existing phylogenetic trees every day. The extremely dense sampling of SARS-CoV-2 genomes also invites a comparison between likelihood and parsimony approaches to phylogenetic inference. Maximum likelihood (ML) and pseudo-ML methods may be more accurate when there are multiple changes at a single site on a single branch, but this accuracy comes at a large computational cost, and the dense sampling of SARS-CoV-2 genomes means that these instances will be extremely rare because each internal branch is expected to be extremely short. Therefore, it may be that approaches based on maximum parsimony (MP) are sufficiently accurate for reconstructing phylogenies of SARS-CoV-2, and their simplicity means that they can be applied to much larger data sets. Here, we evaluate the performance of de novo and online phylogenetic approaches, as well as ML, pseudo-ML, and MP frameworks for inferring large and dense SARS-CoV-2 phylogenies. Overall, we find that online phylogenetics produces similar phylogenetic trees to de novo analyses for SARS-CoV-2, and that MP optimization with UShER and matOptimize produces equivalent SARS-CoV-2 phylogenies to some of the most popular ML and pseudo-ML inference tools. 
MP optimization with UShER and matOptimize is thousands of times faster than presently available implementations of ML, and online phylogenetics is faster than de novo inference. Our results therefore suggest that parsimony-based methods like UShER and matOptimize represent an accurate and more practical alternative to established ML implementations for large SARS-CoV-2 phylogenies and could be successfully applied to other similar data sets with particularly dense sampling and short branch lengths.
Affiliation(s)
- Alexander M Kramer
- Department of Biomolecular Engineering, University of California Santa Cruz, Santa Cruz, CA 95064, USA
- Genomics Institute, University of California Santa Cruz, Santa Cruz, CA 95064, USA
| | - Bryan Thornlow
- Department of Biomolecular Engineering, University of California Santa Cruz, Santa Cruz, CA 95064, USA
- Genomics Institute, University of California Santa Cruz, Santa Cruz, CA 95064, USA
| | - Cheng Ye
- Department of Electrical and Computer Engineering, University of California San Diego, San Diego, CA 92093, USA
| | - Nicola De Maio
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Cambridge CB10 1SD, UK
| | - Jakob McBroome
- Department of Biomolecular Engineering, University of California Santa Cruz, Santa Cruz, CA 95064, USA
- Genomics Institute, University of California Santa Cruz, Santa Cruz, CA 95064, USA
| | - Angie S Hinrichs
- Genomics Institute, University of California Santa Cruz, Santa Cruz, CA 95064, USA
| | - Robert Lanfear
- Department of Ecology and Evolution, Research School of Biology, Australian National University, Canberra, ACT 2601, Australia
| | - Yatish Turakhia
- Department of Electrical and Computer Engineering, University of California San Diego, San Diego, CA 92093, USA
| | - Russell Corbett-Detig
- Department of Biomolecular Engineering, University of California Santa Cruz, Santa Cruz, CA 95064, USA
- Genomics Institute, University of California Santa Cruz, Santa Cruz, CA 95064, USA
| |
|
19
|
Zhang H, Lundberg M, Tarka M, Hasselquist D, Hansson B. Evidence of Site-Specific and Male-Biased Germline Mutation Rate in a Wild Songbird. Genome Biol Evol 2023; 15:evad180. [PMID: 37793164 PMCID: PMC10627410 DOI: 10.1093/gbe/evad180] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/09/2023] [Revised: 09/07/2023] [Accepted: 09/26/2023] [Indexed: 10/06/2023]
Abstract
Germline mutations are the ultimate source of genetic variation and the raw material for organismal evolution. Despite their significance, the frequency and genomic locations of mutations, as well as potential sex bias, are yet to be widely investigated in most species. To address these gaps, we conducted whole-genome sequencing of 12 great reed warblers (Acrocephalus arundinaceus) in a pedigree spanning 3 generations to identify single-nucleotide de novo mutations (DNMs) and estimate the germline mutation rate. We detected 82 DNMs within the pedigree, primarily enriched at CpG sites but otherwise randomly located along the chromosomes. Furthermore, we observed a pronounced sex bias in DNM occurrence, with male warblers exhibiting three times more mutations than females. After correction for false negatives and adjusting for callable sites, we obtained a mutation rate of 7.16 × 10⁻⁹ mutations per site per generation (m/s/g) for the autosomes and 5.10 × 10⁻⁹ m/s/g for the Z chromosome. To demonstrate the utility of species-specific mutation rates, we applied our autosomal mutation rate in models reconstructing the demographic history of the great reed warbler. We uncovered signs of drastic population size reductions predating the last glacial period (LGP) and reduced gene flow between western and eastern populations during the LGP. In conclusion, our results provide one of the few direct estimates of the mutation rate in wild songbirds and evidence for male-driven mutations in accordance with theoretical expectations.
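The correction for false negatives and callable sites mentioned in this abstract can be illustrated with a commonly used pedigree estimator: the number of DNMs divided by twice the callable sites summed over offspring, scaled by the detection rate. This is a sketch under that assumption and not necessarily the authors' exact procedure; the function name and all input numbers below are hypothetical and are not taken from the paper.

```python
def germline_mutation_rate(n_dnms, callable_sites_per_offspring, false_negative_rate=0.0):
    """Per-site, per-generation mutation rate for diploid autosomes.

    n_dnms: de novo mutations detected across all offspring.
    callable_sites_per_offspring: haploid callable genome size for each offspring.
    false_negative_rate: fraction of true DNMs expected to be missed (0 <= rate < 1).
    """
    if not 0.0 <= false_negative_rate < 1.0:
        raise ValueError("false_negative_rate must be in [0, 1)")
    # Each offspring carries two haploid genomes, one from each parent.
    diploid_sites = 2 * sum(callable_sites_per_offspring)
    return n_dnms / (diploid_sites * (1.0 - false_negative_rate))


# Hypothetical inputs: 10 DNMs in one offspring with 5e8 callable sites.
rate_uncorrected = germline_mutation_rate(10, [5e8])
rate_corrected = germline_mutation_rate(10, [5e8], false_negative_rate=0.2)
```

A nonzero false-negative rate raises the estimate, because some true mutations are assumed to have gone undetected.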
Affiliation(s)
- Hongkai Zhang
- Department of Biology, Lund University, Lund, Sweden
| | - Max Lundberg
- Department of Biology, Lund University, Lund, Sweden
| | - Maja Tarka
- Department of Biology, Lund University, Lund, Sweden
| | | | - Bengt Hansson
- Department of Biology, Lund University, Lund, Sweden
| |
|
20
|
Rad SMAH, Wannigama DL, Hirankarn N, McLellan AD. The impact of non-synonymous mutations on miRNA binding sites within the SARS-CoV-2 NSP3 and NSP4 genes. Sci Rep 2023; 13:16945. [PMID: 37805621 PMCID: PMC10560223 DOI: 10.1038/s41598-023-44219-y] [Citation(s) in RCA: 8] [Impact Index Per Article: 4.0] [Received: 02/27/2023] [Accepted: 10/05/2023] [Indexed: 10/09/2023]
Abstract
Non-synonymous mutations in the SARS-CoV-2 spike region affect cell entry, tropism, and immune evasion, while frequent synonymous mutations may modify viral fitness. Host microRNAs, a type of non-coding RNA, play a crucial role in the viral life cycle, influencing viral replication and the host immune response directly or indirectly. Recently, we identified ten miRNAs with a high complementary capacity to target various regions of the SARS-CoV-2 genome. We filtered our candidate miRNAs to only those with documented expression in SARS-CoV-2 target cells, with an additional focus on miRNAs that have been reported in other viral infections. We determined whether mutations in the first SARS-CoV-2 variants of concern affected these miRNA binding sites. Out of ten miRNA binding sites, five were negatively impacted by mutations, including three recurrent synonymous mutations present at high frequency in multiple SARS-CoV-2 lineages: NSP3 C3037U and NSP4 G9802U/C9803U. These mutations were predicted to negatively affect the binding ability of miR-197-5p and miR-18b-5p, respectively. In these preliminary findings, using a dual-reporter assay system, we confirmed the ability of these miRNAs to bind to the predicted NSP3 and NSP4 regions and the loss or reduction of miRNA binding caused by the recurrent mutations.
Affiliation(s)
- S M Ali Hosseini Rad
- Department of Microbiology and Immunology, University of Otago, Dunedin, New Zealand.
- Center of Excellence in Immunology and Immune-Mediated Diseases, Chulalongkorn University, Bangkok, Thailand.
- Department of Microbiology, Faculty of Medicine, Chulalongkorn University, King Chulalongkorn Memorial Hospital, Thai Red Cross Society, Bangkok, Thailand.
| | - Dhammika Leshan Wannigama
- Department of Microbiology, Faculty of Medicine, Chulalongkorn University, King Chulalongkorn Memorial Hospital, Thai Red Cross Society, Bangkok, Thailand
- Department of Infectious Diseases and Infection Control, Yamagata Prefectural Central Hospital, Yamagata, Japan
- Center of Excellence in Antimicrobial Resistance and Stewardship Research, Faculty of Medicine, Chulalongkorn University, Bangkok, Thailand
- School of Medicine, Faculty of Health and Medical Sciences, The University of Western Australia, Nedlands, WA, Australia
- Biofilms and Antimicrobial Resistance Consortium of ODA Receiving Countries, The University of Sheffield, Sheffield, UK
- Pathogen Hunter's Research Team, Department of Infectious Diseases and Infection Control, Yamagata Prefectural Central Hospital, Yamagata, Japan
- Yamagata Prefectural University of Health Sciences, Kamiyanagi, Yamagata, 990-2212, Japan
| | - Nattiya Hirankarn
- Center of Excellence in Immunology and Immune-Mediated Diseases, Chulalongkorn University, Bangkok, Thailand.
- Department of Microbiology, Faculty of Medicine, Chulalongkorn University, King Chulalongkorn Memorial Hospital, Thai Red Cross Society, Bangkok, Thailand.
| | - Alexander D McLellan
- Department of Microbiology and Immunology, University of Otago, Dunedin, New Zealand.
| |
|
21
|
Bloom JD, Neher RA. Fitness effects of mutations to SARS-CoV-2 proteins. Virus Evol 2023; 9:vead055. [PMID: 37727875 PMCID: PMC10506532 DOI: 10.1093/ve/vead055] [Citation(s) in RCA: 33] [Impact Index Per Article: 16.5] [Received: 07/20/2023] [Revised: 08/08/2023] [Accepted: 08/22/2023] [Indexed: 09/21/2023]
Abstract
Knowledge of the fitness effects of mutations to SARS-CoV-2 can inform assessment of new variants, design of therapeutics resistant to escape, and understanding of the functions of viral proteins. However, experimentally measuring effects of mutations is challenging: we lack tractable lab assays for many SARS-CoV-2 proteins, and comprehensive deep mutational scanning has been applied to only two SARS-CoV-2 proteins. Here, we develop an approach that leverages millions of publicly available SARS-CoV-2 sequences to estimate effects of mutations. We first calculate how many independent occurrences of each mutation are expected to be observed along the SARS-CoV-2 phylogeny in the absence of selection. We then compare these expected observations to the actual observations to estimate the effect of each mutation. These estimates correlate well with deep mutational scanning measurements. For most genes, synonymous mutations are nearly neutral, stop-codon mutations are deleterious, and amino acid mutations have a range of effects. However, some viral accessory proteins are under little to no selection. We provide interactive visualizations of effects of mutations to all SARS-CoV-2 proteins (https://jbloomlab.github.io/SARS2-mut-fitness/). The framework we describe is applicable to any virus for which the number of available sequences is sufficiently large that many independent occurrences of each neutral mutation are observed.
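The comparison of expected and actual mutation counts described in this abstract can be illustrated with a small sketch. The log-ratio form and the pseudocount of 0.5 are assumptions made here for illustration, since the abstract does not spell out the estimator, and the function name is hypothetical.

```python
import math


def fitness_effect(actual_count, expected_count, pseudocount=0.5):
    """Log ratio of observed to expected independent occurrences of a mutation.

    Values near 0 suggest a nearly neutral mutation; strongly negative values
    suggest the mutation is seen far less often than expected without selection.
    The pseudocount (an assumed value) keeps the ratio finite when a count is 0.
    """
    return math.log((actual_count + pseudocount) / (expected_count + pseudocount))


neutral_like = fitness_effect(actual_count=10, expected_count=10)
deleterious_like = fitness_effect(actual_count=0, expected_count=20)
```

The estimate is only informative when the expected count is reasonably large, which matches the abstract's point that the framework needs many independent occurrences of each neutral mutation.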
Affiliation(s)
- Jesse D Bloom
- Basic Sciences and Computational Biology, Fred Hutchinson Cancer Center, 1100 Fairview Ave N, Seattle, WA 98109, USA
- Department of Genome Sciences, University of Washington, 3720 15th Ave NE, Seattle, WA 98195, USA
- Howard Hughes Medical Institute, 1100 Fairview Ave N, Seattle, WA 98109, USA
| | - Richard A Neher
- Biozentrum, University of Basel, Spitalstrasse 41, Basel 4056, Switzerland
- Swiss Institute of Bioinformatics, Lausanne 1015, Switzerland
| |
|
22
|
Xi B, Zeng X, Chen Z, Zeng J, Huang L, Du H. SARS-CoV-2 within-host diversity of human hosts and its implications for viral immune evasion. mBio 2023; 14:e0067923. [PMID: 37273216 PMCID: PMC10470530 DOI: 10.1128/mbio.00679-23] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 03/25/2023] [Accepted: 04/17/2023] [Indexed: 06/06/2023]
Abstract
Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) is continuously evolving, bringing great challenges to the control of the virus. In the present study, we investigated the characteristics of SARS-CoV-2 within-host diversity of human hosts and its implications for immune evasion using about 200,000 high-depth next-generation genome sequencing data sets of SARS-CoV-2. A total of 44% of the samples showed within-host variations (iSNVs), and the average number of iSNVs in the samples with iSNVs was 1.90. C-to-U is the dominant substitution pattern for iSNVs. C-to-U/G-to-A and A-to-G/U-to-C preferentially occur in 5'-CG-3' and 5'-AU-3' motifs, respectively. In addition, we found that SARS-CoV-2 within-host variations are under negative selection. About 15.6% of iSNVs had an impact on the content of the CpG dinucleotide (CpG) in SARS-CoV-2 genomes. We detected signatures of faster loss of CpG-gaining iSNVs, possibly resulting from zinc-finger antiviral protein-mediated antiviral activities targeting CpG, which could be the major reason for CpG depletion in SARS-CoV-2 consensus genomes. The non-synonymous iSNVs in the S gene can largely alter the S protein's antigenic features, and many of these iSNVs are distributed in the amino-terminal domain (NTD) and receptor-binding domain (RBD). These results suggest that SARS-CoV-2 interacts actively with human hosts and attempts to take different evolutionary strategies to escape human innate and adaptive immunity. These new findings further deepen and widen our understanding of the within-host evolutionary features of SARS-CoV-2.
IMPORTANCE Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), the causative pathogen of the coronavirus disease 2019, has evolved rapidly since it was discovered. Recent studies have pointed out that some mutations in the SARS-CoV-2 S protein could confer on SARS-CoV-2 the ability to evade the human adaptive immune system. In addition, it is observed that the content of the CpG dinucleotide in SARS-CoV-2 genome sequences has decreased over time, reflecting the adaptation to the human host. The significance of our research lies in revealing the characteristics of SARS-CoV-2 within-host diversity of human hosts, identifying the causes of CpG depletion in SARS-CoV-2 consensus genomes, and exploring the potential impacts of non-synonymous within-host variations in the S gene on immune escape, which could further deepen and widen our understanding of the evolutionary features of SARS-CoV-2.
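To illustrate what it means for an iSNV to "have an impact on the content of the CpG dinucleotide", the sketch below counts how a single-nucleotide substitution changes the number of CG dinucleotides in a sequence. The function name, the 0-based position convention, and the toy sequences are assumptions made here; DNA letters are used, so U is written as T.

```python
def cpg_delta(sequence, position, alt_base):
    """Change in the number of CG dinucleotides caused by one substitution.

    sequence: nucleotide string (DNA alphabet).
    position: 0-based index of the mutated site.
    alt_base: the new nucleotide at that site.
    Returns a negative value for CpG-losing and a positive value for CpG-gaining changes.
    """
    mutated = sequence[:position] + alt_base + sequence[position + 1:]
    return mutated.count("CG") - sequence.count("CG")


cpg_losing = cpg_delta("ACGT", 1, "T")    # C-to-U within a 5'-CG-3' motif
cpg_gaining = cpg_delta("ACAT", 2, "G")   # A-to-G creating a CG
cpg_neutral = cpg_delta("AAAA", 0, "C")   # no CG before or after
```

A C-to-U change inside a 5'-CG-3' motif, the preferred context reported in the abstract, always removes one CpG.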
Affiliation(s)
- Binbin Xi
- School of Biology and Biological Engineering, South China University of Technology, Guangzhou, China
| | - Xi Zeng
- School of Biology and Biological Engineering, South China University of Technology, Guangzhou, China
| | - Zixi Chen
- School of Biology and Biological Engineering, South China University of Technology, Guangzhou, China
| | - Jiong Zeng
- School of Biology and Biological Engineering, South China University of Technology, Guangzhou, China
| | - Lizhen Huang
- School of Biology and Biological Engineering, South China University of Technology, Guangzhou, China
| | - Hongli Du
- School of Biology and Biological Engineering, South China University of Technology, Guangzhou, China
| |
|
23
|
Wu X, Shan K, Zan F, Tang X, Qian Z, Lu J. Optimization and Deoptimization of Codons in SARS-CoV-2 and Related Implications for Vaccine Development. ADVANCED SCIENCE (WEINHEIM, BADEN-WURTTEMBERG, GERMANY) 2023; 10:e2205445. [PMID: 37267926 PMCID: PMC10427376 DOI: 10.1002/advs.202205445] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 09/20/2022] [Revised: 04/08/2023] [Indexed: 06/04/2023]
Abstract
The spread of coronavirus disease 2019 (COVID-19), caused by severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), has progressed into a global pandemic. To date, thousands of genetic variants have been identified among SARS-CoV-2 isolates collected from patients. Sequence analysis reveals that the codon adaptation index (CAI) values of viral sequences have decreased over time but with occasional fluctuations. Through evolution modeling, it is found that this phenomenon may result from the virus's mutation preference during transmission. Using dual-luciferase assays, it is further discovered that the deoptimization of codons in the viral sequence may weaken protein expression during virus evolution, indicating that codon usage may play an important role in virus fitness. Finally, given the importance of codon usage in protein expression and particularly for mRNA vaccines, several codon-optimized Omicron BA.2.12.1, BA.4/5, and XBB.1.5 spike mRNA vaccine candidates are designed and their high levels of expression are experimentally validated. This study highlights the importance of codon usage in virus evolution and provides guidelines for codon optimization in mRNA and DNA vaccine development.
Affiliation(s)
- Xinkai Wu
- State Key Laboratory of Protein and Plant Gene Research, Center for Bioinformatics, School of Life Sciences, Peking University, Beijing 100871, China
| | - Ke‐jia Shan
- State Key Laboratory of Protein and Plant Gene Research, Center for Bioinformatics, School of Life Sciences, Peking University, Beijing 100871, China
| | - Fuwen Zan
- NHC Key Laboratory of Systems Biology of Pathogens, Institute of Pathogen Biology, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing 100176, China
| | - Xiaolu Tang
- State Key Laboratory of Protein and Plant Gene Research, Center for Bioinformatics, School of Life Sciences, Peking University, Beijing 100871, China
| | - Zhaohui Qian
- NHC Key Laboratory of Systems Biology of Pathogens, Institute of Pathogen Biology, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing 100176, China
| | - Jian Lu
- State Key Laboratory of Protein and Plant Gene Research, Center for Bioinformatics, School of Life Sciences, Peking University, Beijing 100871, China
| |
|
24
|
Goiriz L, Ruiz R, Garibo-i-Orts Ò, Conejero JA, Rodrigo G. A variant-dependent molecular clock with anomalous diffusion models SARS-CoV-2 evolution in humans. Proc Natl Acad Sci U S A 2023; 120:e2303578120. [PMID: 37459528 PMCID: PMC10372551 DOI: 10.1073/pnas.2303578120] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/20/2023] [Accepted: 06/11/2023] [Indexed: 07/20/2023]
Abstract
The evolution of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) in humans has been monitored at an unprecedented level due to the public health crisis, yet the stochastic dynamics underlying such a process remain unclear. Here, considering the number of acquired mutations as the displacement of the viral particle from the origin, we performed biostatistical analyses of numerous whole genome sequences on the basis of a time-dependent probabilistic mathematical model. We showed that a model with a constant variant-dependent evolution rate and nonlinear mutational variance with time (i.e., anomalous diffusion) explained the SARS-CoV-2 evolutionary motion in humans during the first 120 wk of the pandemic in the United Kingdom. In particular, we found subdiffusion patterns for the Primal, Alpha, and Omicron variants but a weak superdiffusion pattern for the Delta variant. Our findings indicate that non-Brownian evolutionary motions occur in nature, thereby providing insight for viral phylodynamics.
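In anomalous diffusion the variance grows as a power of time, variance ∝ t^α, with α < 1 for subdiffusion, α = 1 for ordinary (Brownian) diffusion, and α > 1 for superdiffusion. A minimal sketch of estimating α by least squares on log-transformed values follows. This is a generic illustration and not the authors' model-fitting procedure; the function name and the synthetic data are assumptions.

```python
import math


def diffusion_exponent(times, variances):
    """Slope of log(variance) against log(time), i.e. the exponent alpha.

    times, variances: equal-length sequences of positive numbers.
    """
    xs = [math.log(t) for t in times]
    ys = [math.log(v) for v in variances]
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


weeks = [1, 2, 4, 8, 16]
# Synthetic variances that grow as t**0.5, so the fitted exponent indicates subdiffusion.
alpha = diffusion_exponent(weeks, [t ** 0.5 for t in weeks])
```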
Affiliation(s)
- Lucas Goiriz
- BioInstituto de Biología Integrativa de Sistemas, Consejo Superior de Investigaciones Científicas – Universitat de València, 46980 Paterna, Spain
- Institut Universitari de Matemàtica Pura i Aplicada, Universitat Politècnica de València, 46022 Valencia, Spain
| | - Raúl Ruiz
- BioInstituto de Biología Integrativa de Sistemas, Consejo Superior de Investigaciones Científicas – Universitat de València, 46980 Paterna, Spain
| | - Òscar Garibo-i-Orts
- Institut Universitari de Matemàtica Pura i Aplicada, Universitat Politècnica de València, 46022 Valencia, Spain
| | - J. Alberto Conejero
- Institut Universitari de Matemàtica Pura i Aplicada, Universitat Politècnica de València, 46022 Valencia, Spain
| | - Guillermo Rodrigo
- BioInstituto de Biología Integrativa de Sistemas, Consejo Superior de Investigaciones Científicas – Universitat de València, 46980 Paterna, Spain
| |
|
25
|
Masone D, Soledad Alvarez M, Polo LM. The SARS-CoV-2 mutation landscape is shaped before replication starts. Genet Mol Biol 2023; 46:e20230005. [PMID: 37338301 DOI: 10.1590/1678-4685-gmb-2023-0005] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/18/2023] [Accepted: 05/05/2023] [Indexed: 06/21/2023]
Abstract
Mutation landscapes and signatures have been thoroughly studied in SARS-CoV-2. Here, we analyse those patterns and link their changes to the viral replication tissue in the respiratory tract. Surprisingly, a substantial difference in those patterns is observed in samples from vaccinated patients. Hence, we propose a model to explain where those mutations could originate during the replication cycle.
Affiliation(s)
- Diego Masone
- Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET), Universidad Nacional de Cuyo (UNCuyo), Instituto de Histología y Embriología de Mendoza (IHEM), Mendoza, Argentina
- Universidad Nacional de Cuyo (UNCuyo), Facultad de Ingeniería, Mendoza, Argentina
| | - Maria Soledad Alvarez
- Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET), Universidad Nacional de Cuyo (UNCuyo), Instituto de Medicina y Biología Experimental de Cuyo (IMBECU), Mendoza, Argentina
| | - Luis Mariano Polo
- Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET), Universidad Nacional de Cuyo (UNCuyo), Instituto de Histología y Embriología de Mendoza (IHEM), Mendoza, Argentina
| |
|
26
|
Bloom JD, Neher RA. Fitness effects of mutations to SARS-CoV-2 proteins. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2023:2023.01.30.526314. [PMID: 36778462 PMCID: PMC9915511 DOI: 10.1101/2023.01.30.526314] [Citation(s) in RCA: 12] [Impact Index Per Article: 6.0] [Indexed: 02/04/2023]
Abstract
Knowledge of the fitness effects of mutations to SARS-CoV-2 can inform assessment of new variants, design of therapeutics resistant to escape, and understanding of the functions of viral proteins. However, experimentally measuring effects of mutations is challenging: we lack tractable lab assays for many SARS-CoV-2 proteins, and comprehensive deep mutational scanning has been applied to only two SARS-CoV-2 proteins. Here we develop an approach that leverages millions of publicly available SARS-CoV-2 sequences to estimate effects of mutations. We first calculate how many independent occurrences of each mutation are expected to be observed along the SARS-CoV-2 phylogeny in the absence of selection. We then compare these expected observations to the actual observations to estimate the effect of each mutation. These estimates correlate well with deep mutational scanning measurements. For most genes, synonymous mutations are nearly neutral, stop-codon mutations are deleterious, and amino-acid mutations have a range of effects. However, some viral accessory proteins are under little to no selection. We provide interactive visualizations of effects of mutations to all SARS-CoV-2 proteins (https://jbloomlab.github.io/SARS2-mut-fitness/). The framework we describe is applicable to any virus for which the number of available sequences is sufficiently large that many independent occurrences of each neutral mutation are observed.
Affiliation(s)
- Jesse D. Bloom
- Basic Sciences and Computational Biology, Fred Hutchinson Cancer Center
- Department of Genome Sciences, University of Washington
- Howard Hughes Medical Institute
| | - Richard A. Neher
- Biozentrum, University of Basel
- Swiss Institute of Bioinformatics
| |
|
27
|
Saldivar-Espinoza B, Garcia-Segura P, Novau-Ferré N, Macip G, Martínez R, Puigbò P, Cereto-Massagué A, Pujadas G, Garcia-Vallve S. The Mutational Landscape of SARS-CoV-2. Int J Mol Sci 2023; 24:ijms24109072. [PMID: 37240420 DOI: 10.3390/ijms24109072] [Citation(s) in RCA: 17] [Impact Index Per Article: 8.5] [Received: 03/28/2023] [Revised: 05/12/2023] [Accepted: 05/16/2023] [Indexed: 05/28/2023]
Abstract
Mutation research is crucial for detecting and treating SARS-CoV-2 and developing vaccines. Using over 5,300,000 sequences from SARS-CoV-2 genomes and custom Python programs, we analyzed the mutational landscape of SARS-CoV-2. Although almost every nucleotide in the SARS-CoV-2 genome has mutated at some time, the substantial differences in the frequency and regularity of mutations warrant further examination. C>U mutations are the most common. They are found in the largest number of variants, pangolin lineages, and countries, which indicates that they are a driving force behind the evolution of SARS-CoV-2. Not all SARS-CoV-2 genes have mutated in the same way. Fewer non-synonymous single nucleotide variations are found in genes that encode proteins with a critical role in virus replication than in genes with ancillary roles. Some genes, such as spike (S) and nucleocapsid (N), show more non-synonymous mutations than others. Although the prevalence of mutations in the target regions of COVID-19 diagnostic RT-qPCR tests is generally low, in some cases, such as for some primers that bind to the N gene, it is significant. Therefore, ongoing monitoring of SARS-CoV-2 mutations is crucial. The SARS-CoV-2 Mutation Portal provides access to a database of SARS-CoV-2 mutations.
Affiliation(s)
- Bryan Saldivar-Espinoza
- Departament de Bioquímica i Biotecnologia, Research Group in Cheminformatics & Nutrition, Campus de Sescelades, Universitat Rovira i Virgili, 43007 Tarragona, Spain
| | - Pol Garcia-Segura
- Departament de Bioquímica i Biotecnologia, Research Group in Cheminformatics & Nutrition, Campus de Sescelades, Universitat Rovira i Virgili, 43007 Tarragona, Spain
| | - Nil Novau-Ferré
- Departament de Bioquímica i Biotecnologia, Research Group in Cheminformatics & Nutrition, Campus de Sescelades, Universitat Rovira i Virgili, 43007 Tarragona, Spain
| | - Guillem Macip
- Departament de Bioquímica i Biotecnologia, Research Group in Cheminformatics & Nutrition, Campus de Sescelades, Universitat Rovira i Virgili, 43007 Tarragona, Spain
| | | | - Pere Puigbò
- Department of Biology, University of Turku, 20500 Turku, Finland
- Department of Biochemistry and Biotechnology, Rovira i Virgili University, 43007 Tarragona, Spain
- Eurecat, Technology Centre of Catalonia, Unit of Nutrition and Health, 43204 Reus, Spain
| | - Adrià Cereto-Massagué
- EURECAT Centre Tecnològic de Catalunya, Centre for Omic Sciences (COS), Joint Unit Universitat Rovira i Virgili-EURECAT, Unique Scientific and Technical Infrastructures (ICTS), 43204 Reus, Spain
| | - Gerard Pujadas
- Departament de Bioquímica i Biotecnologia, Research Group in Cheminformatics & Nutrition, Campus de Sescelades, Universitat Rovira i Virgili, 43007 Tarragona, Spain
| | - Santiago Garcia-Vallve
- Departament de Bioquímica i Biotecnologia, Research Group in Cheminformatics & Nutrition, Campus de Sescelades, Universitat Rovira i Virgili, 43007 Tarragona, Spain
| |
|
28
|
Paulino-Ramírez R, López P, Mueses S, Cuevas P, Jabier M, Rivera-Amill V. Genomic Surveillance of SARS-CoV-2 Variants in the Dominican Republic and Emergence of a Local Lineage. INTERNATIONAL JOURNAL OF ENVIRONMENTAL RESEARCH AND PUBLIC HEALTH 2023; 20:ijerph20085503. [PMID: 37107785 PMCID: PMC10138544 DOI: 10.3390/ijerph20085503] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 12/19/2022] [Revised: 02/24/2023] [Accepted: 04/03/2023] [Indexed: 05/11/2023]
Abstract
Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) is an RNA virus that evolves over time, leading to new variants. In the current study, we assessed the genomic epidemiology of SARS-CoV-2 in the Dominican Republic. A total of 1149 SARS-CoV-2 complete genome nucleotide sequences from samples collected between March 2020 and mid-February 2022 in the Dominican Republic were obtained from the Global Initiative on Sharing All Influenza Data (GISAID) database. Phylogenetic relationships and evolution rates were analyzed using the maximum likelihood method and the Bayesian Markov chain Monte Carlo (MCMC) approach. The genotyping details (lineages) were obtained using the Pangolin web application. In addition, the web tools Coronapp and Genome Detective Viral Tools, among others, were used to monitor epidemiological characteristics. Our results show that the most frequent non-synonymous mutation over the study period was D614G. Of the 1149 samples, 870 (75.74%) were classified into 8 relevant variants according to Pangolin/Scorpio. The first Variants Being Monitored (VBM) were detected in December 2020. Meanwhile, in 2021, the variants of concern Delta and Omicron were identified. The mean mutation rate was estimated to be 1.5523 × 10⁻³ (95% HPD: 1.2358 × 10⁻³, 1.8635 × 10⁻³) nucleotide substitutions per site. We also report the emergence of an autochthonous SARS-CoV-2 lineage, B.1.575.2, that circulated from October 2021 to January 2022, in co-circulation with the variants of concern Delta and Omicron. The impact of B.1.575.2 in the Dominican Republic was minimal, but it then expanded rapidly in Spain. A better understanding of viral evolution and genomic surveillance data will help to inform strategies to mitigate the impact on public health.
Affiliation(s)
- Robert Paulino-Ramírez
- Instituto de Medicina Tropical y Salud Global, Universidad Iberoamericana, Research Hub, Santo Domingo 22333, Dominican Republic
| | - Pablo López
- RCMI Center for Research Resources, Ponce Research Institute, Ponce, PR 00716-2348, USA (V.R.-A.)
| | - Sayira Mueses
- Instituto de Medicina Tropical y Salud Global, Universidad Iberoamericana, Research Hub, Santo Domingo 22333, Dominican Republic
| | - Paula Cuevas
- Instituto de Medicina Tropical y Salud Global, Universidad Iberoamericana, Research Hub, Santo Domingo 22333, Dominican Republic
| | - Maridania Jabier
- Instituto de Medicina Tropical y Salud Global, Universidad Iberoamericana, Research Hub, Santo Domingo 22333, Dominican Republic
- Servicio Nacional de Salud (SNS), Ministry of Health, Santo Domingo 10201, Dominican Republic
| | - Vanessa Rivera-Amill
- RCMI Center for Research Resources, Ponce Research Institute, Ponce, PR 00716-2348, USA (V.R.-A.)
- Basic Sciences Department, School of Medicine, Ponce Health Sciences University, Ponce, PR 00716-2348, USA
| |
|
29
|
De Maio N, Kalaghatgi P, Turakhia Y, Corbett-Detig R, Minh BQ, Goldman N. Maximum likelihood pandemic-scale phylogenetics. Nat Genet 2023; 55:746-752. [PMID: 37038003 PMCID: PMC10181937 DOI: 10.1038/s41588-023-01368-0] [Citation(s) in RCA: 13] [Impact Index Per Article: 6.5] [Received: 08/19/2022] [Accepted: 03/07/2023] [Indexed: 04/12/2023]
Abstract
Phylogenetics has a crucial role in genomic epidemiology. Enabled by unparalleled volumes of genome sequence data generated to study and help contain the COVID-19 pandemic, phylogenetic analyses of SARS-CoV-2 genomes have shed light on the virus's origins, spread, and the emergence and reproductive success of new variants. However, most phylogenetic approaches, including maximum likelihood and Bayesian methods, cannot scale to the size of the datasets from the current pandemic. We present 'MAximum Parsimonious Likelihood Estimation' (MAPLE), an approach for likelihood-based phylogenetic analysis of epidemiological genomic datasets at unprecedented scales. MAPLE infers SARS-CoV-2 phylogenies more accurately than existing maximum likelihood approaches while running up to thousands of times faster, and requiring at least 100 times less memory on large datasets. This extends the reach of genomic epidemiology, allowing the continued use of accurate phylogenetic, phylogeographic and phylodynamic analyses on datasets of millions of genomes.
Affiliation(s)
- Nicola De Maio
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Hinxton, UK.
| | | | - Yatish Turakhia
- Department of Electrical and Computer Engineering, University of California San Diego, San Diego, CA, USA
| | - Russell Corbett-Detig
- Department of Biomolecular Engineering, University of California Santa Cruz, Santa Cruz, CA, USA
- Genomics Institute, University of California Santa Cruz, Santa Cruz, CA, USA
| | - Bui Quang Minh
- School of Computing, College of Engineering, Computing and Cybernetics, Australian National University, Canberra, Australian Capital Territory, Australia
| | - Nick Goldman
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Hinxton, UK
| |
Collapse
30
Bloom JD, Beichman AC, Neher RA, Harris K. Evolution of the SARS-CoV-2 Mutational Spectrum. Mol Biol Evol 2023; 40:msad085. [PMID: 37039557 PMCID: PMC10124870 DOI: 10.1093/molbev/msad085] [Citation(s) in RCA: 27] [Impact Index Per Article: 13.5] [Received: 11/19/2022] [Revised: 02/07/2023] [Accepted: 04/06/2023] [Indexed: 04/12/2023] Open
Abstract
SARS-CoV-2 evolves rapidly in part because of its high mutation rate. Here, we examine whether this mutational process itself has changed during viral evolution. To do this, we quantify the relative rates of different types of single-nucleotide mutations at 4-fold degenerate sites in the viral genome across millions of human SARS-CoV-2 sequences. We find clear shifts in the relative rates of several types of mutations during SARS-CoV-2 evolution. The most striking trend is a roughly 2-fold decrease in the relative rate of G→T mutations in Omicron versus early clades, as was recently noted by Ruis et al. (2022. Mutational spectra distinguish SARS-CoV-2 replication niches. bioRxiv, doi:10.1101/2022.09.27.509649). There is also a decrease in the relative rate of C→T mutations in Delta, and other subtle changes in the mutation spectrum along the phylogeny. We speculate that these changes in the mutation spectrum could arise from viral mutations that affect genome replication, packaging, and antagonization of host innate-immune factors, although environmental factors could also play a role. Interestingly, the mutation spectrum of Omicron is more similar than that of earlier SARS-CoV-2 clades to the spectrum that shaped the long-term evolution of sarbecoviruses. Overall, our work shows that the mutation process is itself a dynamic variable during SARS-CoV-2 evolution and suggests that human SARS-CoV-2 may be trending toward a mutation spectrum more similar to that of other animal sarbecoviruses.
Affiliation(s)
- Jesse D Bloom
  - Basic Sciences Division and Computational Biology Program, Fred Hutchinson Cancer Center, Seattle, WA
  - Department of Genome Sciences, University of Washington, Seattle, WA
  - Howard Hughes Medical Institute, Seattle, WA
- Richard A Neher
  - Biozentrum, University of Basel, Basel, Switzerland
  - Swiss Institute of Bioinformatics, Lausanne, Switzerland
- Kelley Harris
  - Department of Genome Sciences, University of Washington, Seattle, WA
31
Sun Q, Zeng J, Tang K, Long H, Zhang C, Zhang J, Tang J, Xin Y, Zheng J, Sun L, Liu S, Du X. Variation in synonymous evolutionary rates in the SARS-CoV-2 genome. Front Microbiol 2023; 14:1136386. [PMID: 36970680 PMCID: PMC10034387 DOI: 10.3389/fmicb.2023.1136386] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 01/03/2023] [Accepted: 02/13/2023] [Indexed: 03/11/2023] Open
Abstract
Introduction: Coronavirus disease 2019 is an infectious disease caused by severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2). Influential variants and mutants of this virus continue to emerge, and more effective virus-related information is urgently required for identifying and predicting new mutants. According to earlier reports, synonymous substitutions were considered phenotypically silent; thus, such mutations were frequently ignored in studies of viral mutations because they did not directly cause amino acid changes. However, recent studies have shown that synonymous substitutions are not completely silent, and their patterns and potential functional correlations should thus be delineated for better control of the pandemic.
Methods: In this study, we estimated the synonymous evolutionary rate (SER) across the SARS-CoV-2 genome and used it to infer the relationship between the viral RNA and host protein. We also assessed the patterns of characteristic mutations found in different viral lineages.
Results: We found that the SER varies across the genome and that the variation is primarily influenced by codon-related factors. Moreover, the conserved motifs identified based on the SER were found to be related to host RNA transport and regulation. Importantly, the majority of the existing fixed-characteristic mutations for five important virus lineages (Alpha, Beta, Gamma, Delta, and Omicron) were significantly enriched in partially constrained regions.
Discussion: Taken together, our results provide unique information on the evolutionary and functional dynamics of SARS-CoV-2 based on synonymous mutations and offer potentially useful information for better control of the SARS-CoV-2 pandemic.
Affiliation(s)
- Qianru Sun
  - School of Public Health (Shenzhen), Shenzhen Campus of Sun Yat-sen University, Shenzhen, China
  - School of Public Health (Shenzhen), Sun Yat-sen University, Guangzhou, China
- Jinfeng Zeng
  - School of Public Health (Shenzhen), Shenzhen Campus of Sun Yat-sen University, Shenzhen, China
  - School of Public Health (Shenzhen), Sun Yat-sen University, Guangzhou, China
- Kang Tang
  - School of Public Health (Shenzhen), Shenzhen Campus of Sun Yat-sen University, Shenzhen, China
  - School of Public Health (Shenzhen), Sun Yat-sen University, Guangzhou, China
- Haoyu Long
  - School of Public Health (Shenzhen), Shenzhen Campus of Sun Yat-sen University, Shenzhen, China
  - School of Public Health (Shenzhen), Sun Yat-sen University, Guangzhou, China
- Chi Zhang
  - School of Public Health (Shenzhen), Shenzhen Campus of Sun Yat-sen University, Shenzhen, China
  - School of Public Health (Shenzhen), Sun Yat-sen University, Guangzhou, China
- Jie Zhang
  - School of Public Health (Shenzhen), Shenzhen Campus of Sun Yat-sen University, Shenzhen, China
  - School of Public Health (Shenzhen), Sun Yat-sen University, Guangzhou, China
- Jing Tang
  - School of Public Health (Shenzhen), Shenzhen Campus of Sun Yat-sen University, Shenzhen, China
  - School of Public Health (Shenzhen), Sun Yat-sen University, Guangzhou, China
- Yuting Xin
  - School of Public Health (Shenzhen), Shenzhen Campus of Sun Yat-sen University, Shenzhen, China
  - School of Public Health (Shenzhen), Sun Yat-sen University, Guangzhou, China
- Jialu Zheng
  - School of Public Health (Shenzhen), Shenzhen Campus of Sun Yat-sen University, Shenzhen, China
  - School of Public Health (Shenzhen), Sun Yat-sen University, Guangzhou, China
- Litao Sun
  - School of Public Health (Shenzhen), Shenzhen Campus of Sun Yat-sen University, Shenzhen, China
  - School of Public Health (Shenzhen), Sun Yat-sen University, Guangzhou, China
- Siyang Liu
  - School of Public Health (Shenzhen), Shenzhen Campus of Sun Yat-sen University, Shenzhen, China
  - School of Public Health (Shenzhen), Sun Yat-sen University, Guangzhou, China
- Xiangjun Du
  - School of Public Health (Shenzhen), Shenzhen Campus of Sun Yat-sen University, Shenzhen, China
  - School of Public Health (Shenzhen), Sun Yat-sen University, Guangzhou, China
  - Key Laboratory of Tropical Disease Control, Ministry of Education, Sun Yat-sen University, Guangzhou, China
  - *Correspondence: Xiangjun Du
32
Gazeau S, Deng X, Ooi HK, Mostefai F, Hussin J, Heffernan J, Jenner AL, Craig M. The race to understand immunopathology in COVID-19: Perspectives on the impact of quantitative approaches to understand within-host interactions. IMMUNOINFORMATICS (AMSTERDAM, NETHERLANDS) 2023; 9:100021. [PMID: 36643886 PMCID: PMC9826539 DOI: 10.1016/j.immuno.2023.100021] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 06/17/2022] [Revised: 11/16/2022] [Accepted: 01/03/2023] [Indexed: 01/09/2023]
Abstract
The COVID-19 pandemic has revealed the need for the increased integration of modelling and data analysis to public health, experimental, and clinical studies. Throughout the first two years of the pandemic, there has been a concerted effort to improve our understanding of the within-host immune response to the SARS-CoV-2 virus to provide better predictions of COVID-19 severity, treatment and vaccine development questions, and insights into viral evolution and the impacts of variants on immunopathology. Here we provide perspectives on what has been accomplished using quantitative methods, including predictive modelling, population genetics, machine learning, and dimensionality reduction techniques, in the first 26 months of the COVID-19 pandemic approaches, and where we go from here to improve our responses to this and future pandemics.
Affiliation(s)
- Sonia Gazeau
  - Department of Mathematics and Statistics, Université de Montréal, Montréal, Canada
  - Sainte-Justine University Hospital Research Centre, Montréal, Canada
- Xiaoyan Deng
  - Department of Mathematics and Statistics, Université de Montréal, Montréal, Canada
  - Sainte-Justine University Hospital Research Centre, Montréal, Canada
- Hsu Kiang Ooi
  - Digital Technologies Research Centre, National Research Council Canada, Toronto, Canada
- Fatima Mostefai
  - Montréal Heart Institute Research Centre, Montréal, Canada
  - Department of Medicine, Faculty of Medicine, Université de Montréal, Montréal, Canada
- Julie Hussin
  - Montréal Heart Institute Research Centre, Montréal, Canada
  - Department of Medicine, Faculty of Medicine, Université de Montréal, Montréal, Canada
- Jane Heffernan
  - Modelling Infection and Immunity Lab, Mathematics Statistics, York University, Toronto, Canada
  - Centre for Disease Modelling (CDM), Mathematics Statistics, York University, Toronto, Canada
- Adrianne L Jenner
  - School of Mathematical Sciences, Queensland University of Technology, Brisbane, Australia
- Morgan Craig
  - Department of Mathematics and Statistics, Université de Montréal, Montréal, Canada
  - Sainte-Justine University Hospital Research Centre, Montréal, Canada
33
Evaluating Data Sharing of SARS-CoV-2 Genomes for Molecular Epidemiology across the COVID-19 Pandemic. Viruses 2023; 15:v15020560. [PMID: 36851774 PMCID: PMC9959893 DOI: 10.3390/v15020560] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/31/2022] [Revised: 02/12/2023] [Accepted: 02/16/2023] [Indexed: 02/22/2023] Open
Abstract
Following the emergence of COVID-19 in December 2019, caused by the coronavirus SARS-CoV-2, the disease spread dramatically worldwide. The use of genomics to trace the dissemination of the virus and the identification of novel variants was essential in defining measures for containing the disease. We aim to evaluate the global effort to genomically characterize the circulating lineages of SARS-CoV-2, considering the data deposited in GISAID, the major platform for data sharing in a massive worldwide collaborative undertaking. We contextualize data for nearly three years (January 2020-October 2022) for the major contributing countries, percentage of characterized isolates and time for data processing in the context of the global pandemic. Within this collaborative effort, we also evaluated the early detection of seven major SARS-CoV-2 lineages, G, GR, GH, GK, GV, GRY and GRA. While Europe and the USA, following an initial period, showed positive results across time in terms of cases sequenced and time for data deposition, this effort is heterogeneous worldwide. Given the current immunization the major threat is the appearance of variants that evade the acquired immunity. In that scenario, the monitoring of those hypothetical variants will still play an essential role.
34
Correlated substitutions reveal SARS-like coronaviruses recombine frequently with a diverse set of structured gene pools. Proc Natl Acad Sci U S A 2023; 120:e2206945119. [PMID: 36693089 PMCID: PMC9945976 DOI: 10.1073/pnas.2206945119] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Indexed: 01/25/2023] Open
Abstract
Quantifying SARS-like coronavirus (SL-CoV) evolution is critical to understanding the origins of SARS-CoV-2 and the molecular processes that could underlie future epidemic viruses. While genomic analyses suggest recombination was a factor in the emergence of SARS-CoV-2, few studies have quantified recombination rates among SL-CoVs. Here, we infer recombination rates of SL-CoVs from correlated substitutions in sequencing data using a coalescent model with recombination. Our computationally-efficient, non-phylogenetic method infers recombination parameters of both sampled sequences and the unsampled gene pools with which they recombine. We apply this approach to infer recombination parameters for a range of positive-sense RNA viruses. We then analyze a set of 191 SL-CoV sequences (including SARS-CoV-2) and find that ORF1ab and S genes frequently undergo recombination. We identify which SL-CoV sequence clusters have recombined with shared gene pools, and show that these pools have distinct structures and high recombination rates, with multiple recombination events occurring per synonymous substitution. We find that individual genes have recombined with different viral reservoirs. By decoupling contributions from mutation and recombination, we recover the phylogeny of non-recombined portions for many of these SL-CoVs, including the position of SARS-CoV-2 in this clonal phylogeny. Lastly, by analyzing >400,000 SARS-CoV-2 whole genome sequences, we show current diversity levels are insufficient to infer the within-population recombination rate of the virus since the pandemic began. Our work offers new methods for inferring recombination rates in RNA viruses with implications for understanding recombination in SARS-CoV-2 evolution and the structure of clonal relationships and gene pools shaping its origins.
35
Focosi D, Quiroga R, McConnell S, Johnson MC, Casadevall A. Convergent Evolution in SARS-CoV-2 Spike Creates a Variant Soup from Which New COVID-19 Waves Emerge. Int J Mol Sci 2023; 24:2264. [PMID: 36768588 PMCID: PMC9917121 DOI: 10.3390/ijms24032264] [Citation(s) in RCA: 71] [Impact Index Per Article: 35.5] [Received: 12/05/2022] [Revised: 12/27/2022] [Accepted: 12/27/2022] [Indexed: 01/26/2023] Open
Abstract
The first 2 years of the COVID-19 pandemic were mainly characterized by recurrent mutations of SARS-CoV-2 Spike protein at residues K417, L452, E484, N501 and P681 emerging independently across different variants of concern (Alpha, Beta, Gamma, and Delta). Such homoplasy is a marker of convergent evolution. Since Spring 2022 and the third year of the pandemic, with the advent of Omicron and its sublineages, convergent evolution has led to the observation of different lineages acquiring an additional group of mutations at different amino acid residues, namely R346, K444, N450, N460, F486, F490, Q493, and S494. Mutations at these residues have become increasingly prevalent during Summer and Autumn 2022, with combinations showing increased fitness. The most likely reason for this convergence is the selective pressure exerted by previous infection- or vaccine-elicited immunity. Such accelerated evolution has caused failure of all anti-Spike monoclonal antibodies, including bebtelovimab and cilgavimab. While we are learning how fast coronaviruses can mutate and recombine, we should reconsider opportunities for economically sustainable escape-proof combination therapies, and refocus antibody-mediated therapeutic efforts on polyclonal preparations that are less likely to allow for viral immune escape.
Affiliation(s)
- Daniele Focosi
  - North-Western Tuscany Blood Bank, Pisa University Hospital, 56124 Pisa, Italy
- Rodrigo Quiroga
  - Instituto de Investigaciones en Físico-Química de Córdoba (INFIQC-CONICET), Facultad de Ciencias Químicas, Universidad Nacional de Córdoba, Cordova 5000, Argentina
- Scott McConnell
  - Department of Molecular Microbiology and Immunology, Johns Hopkins Bloomberg School of Public Health, Baltimore, MD 21205, USA
- Marc C. Johnson
  - Department of Molecular Microbiology and Immunology, University of Missouri School of Medicine, Columbia, MO 65201, USA
- Arturo Casadevall
  - Department of Molecular Microbiology and Immunology, Johns Hopkins Bloomberg School of Public Health, Baltimore, MD 21205, USA
36
Bai H, Ata G, Sun Q, Rahman SU, Tao S. Natural selection pressure exerted on "Silent" mutations during the evolution of SARS-CoV-2: Evidence from codon usage and RNA structure. Virus Res 2023; 323:198966. [PMID: 36244617 PMCID: PMC9561399 DOI: 10.1016/j.virusres.2022.198966] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Received: 08/30/2022] [Revised: 10/08/2022] [Accepted: 10/10/2022] [Indexed: 01/25/2023]
Abstract
From the first emergence of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) till now, multiple mutations that caused synonymous and nonsynonymous substitutions have accumulated. Among them, synonymous substitutions were regarded as "silent" mutations that received less attention than nonsynonymous substitutions that cause amino acid variations. However, the importance of synonymous substitutions can not be neglected. This research focuses on synonymous substitutions on SARS-CoV-2 and proves that synonymous substitutions were under purifying selection in its evolution. The evidence of purifying selection is provided by comparing the mutation number per site in coding and non-coding regions. We then study the two forces of purifying selection: synonymous codon usage and RNA secondary structure. Results show that the codon usage optimization leads to an adapted codon usage towards humans. Furthermore, our results show that the maintenance of RNA secondary structure causes the purifying of synonymous substitutions in the structural region. These results explain the selection pressure on synonymous substitutions during the evolution of SARS-CoV-2.
Affiliation(s)
- Haoxiang Bai
  - College of Life Sciences, Northwest A&F University, Yangling, China; Bioinformatics Center, Northwest A&F University, Yangling, China
- Galal Ata
  - College of Life Sciences, Northwest A&F University, Yangling, China; Bioinformatics Center, Northwest A&F University, Yangling, China
- Qing Sun
  - College of Life Sciences, Northwest A&F University, Yangling, China; Bioinformatics Center, Northwest A&F University, Yangling, China
- Siddiq Ur Rahman
  - Department of Computer Science and Bioinformatics, Khushal Khan Khattak University, Karak, Khyber Pakhtunkhwa, Pakistan
- Shiheng Tao
  - College of Life Sciences, Northwest A&F University, Yangling, China; Bioinformatics Center, Northwest A&F University, Yangling, China.
37
Lim Kai Rong M, Kuruoglu EE, Chan WKV. Modeling SARS-CoV-2 nucleotide mutations as a stochastic process. PLoS One 2023; 18:e0284874. [PMID: 37115784 PMCID: PMC10146438 DOI: 10.1371/journal.pone.0284874] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/26/2022] [Accepted: 04/11/2023] [Indexed: 04/29/2023] Open
Abstract
This study analyzes the SARS-CoV-2 genome sequence mutations by modeling its nucleotide mutations as a stochastic process in both the time-series and spatial domain of the gene sequence. In the time-series model, a Markov Chain embedded Poisson random process characterizes the mutation rate matrix, while the spatial gene sequence model delineates the distribution of mutation inter-occurrence distances. Our experiment focuses on five key variants of concern that had become a global concern due to their high transmissibility and virulence. The time-series results reveal distinct asymmetries in mutation rate and propensities among different nucleotides and across different strains, with a mean mutation rate of approximately 2 mutations per month. In particular, our spatial gene sequence results reveal some novel biological insights on the characteristic distribution of mutation inter-occurrence distances, which display a notable pattern similar to other natural diseases. Our findings contribute interesting insights to the underlying biological mechanism of SARS-CoV-2 mutations, bringing us one step closer to improving the accuracy of existing mutation prediction models. This research could also potentially pave the way for future work in adopting similar spatial random process models and advanced spatial pattern recognition algorithms in order to characterize mutations on other different kinds of virus families.
38
Bousali M, Pogka V, Vatsellas G, Loupis T, Athanasiadis EI, Zoi K, Thanos D, Paraskevis D, Mentis A, Karamitros T. Tracing the First Days of the SARS-CoV-2 Pandemic in Greece and the Role of the First Imported Group of Travelers. Microbiol Spectr 2022; 10:e0213422. [PMID: 36409093 PMCID: PMC9769540 DOI: 10.1128/spectrum.02134-22] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/07/2022] [Accepted: 10/31/2022] [Indexed: 11/23/2022] Open
Abstract
The first SARS-CoV-2 case in Greece was confirmed on February 26, 2020, and since then, multiple strains have circulated the country, leading to regional and country-wide outbreaks. Our aim is to enlighten the events that took place during the first days of the SARS-CoV-2 pandemic in Greece, focusing on the role of the first imported group of travelers. We used whole-genome SARS-CoV-2 sequences obtained from the infected travelers of the group as well as Greece-derived and globally subsampled sequences and applied dedicated phylogenetics and phylodynamics tools as well as in-house-developed bioinformatics pipelines. Our analyses reveal the genetic variants circulating in Greece during the first days of the pandemic and the role of the group's imported strains in the course of the first pandemic wave in Greece. The strain that dominated in Greece throughout the first wave, bearing the D614G mutation, was primarily imported from a certain group of travelers, while molecular and clinical data suggest that the infection of the travelers occurred in Egypt. Founder effects early in the pandemic are important for the success of certain strains, as those arriving early, several times, and to diverse locations lead to the formation of large transmission clusters that can be estimated using molecular epidemiology approaches and can be a useful surveillance tool for the prioritization of nonpharmaceutical interventions and combating present and future outbreaks.
IMPORTANCE: The strain that dominated in Greece during the first pandemic wave was primarily imported from a group of returning travelers in February 2020, while molecular and clinical data suggest that the origin of the transmission was Egypt. The observed molecular transmission clusters reflect the transmission dynamics of this particular strain bearing the D614G mutation while highlighting the necessity of their use as a surveillance tool for the prioritization of nonpharmaceutical interventions and combating present and future outbreaks.
Affiliation(s)
- Maria Bousali
  - Bioinformatics and Applied Genomics Unit, Department of Microbiology, Hellenic Pasteur Institute, Athens, Greece
- Vasiliki Pogka
  - Bioinformatics and Applied Genomics Unit, Department of Microbiology, Hellenic Pasteur Institute, Athens, Greece
  - Laboratory of Medical Microbiology, Department of Microbiology, Hellenic Pasteur Institute, Athens, Greece
- Giannis Vatsellas
  - Greek Genome Center, Biomedical Research Foundation of the Academy of Athens (BRFAA), Athens, Greece
- Theodoros Loupis
  - Greek Genome Center, Biomedical Research Foundation of the Academy of Athens (BRFAA), Athens, Greece
  - Haematology Research Laboratory, Biomedical Research Foundation of the Academy of Athens (BRFAA), Athens, Greece
- Emmanouil I. Athanasiadis
  - Greek Genome Center, Biomedical Research Foundation of the Academy of Athens (BRFAA), Athens, Greece
- Katerina Zoi
  - Greek Genome Center, Biomedical Research Foundation of the Academy of Athens (BRFAA), Athens, Greece
  - Haematology Research Laboratory, Biomedical Research Foundation of the Academy of Athens (BRFAA), Athens, Greece
- Dimitris Thanos
  - Greek Genome Center, Biomedical Research Foundation of the Academy of Athens (BRFAA), Athens, Greece
- Dimitrios Paraskevis
  - Department of Hygiene Epidemiology and Medical Statistics, School of Medicine, National and Kapodistrian University of Athens, Athens, Greece
- Andreas Mentis
  - Laboratory of Medical Microbiology, Department of Microbiology, Hellenic Pasteur Institute, Athens, Greece
- Timokratis Karamitros
  - Bioinformatics and Applied Genomics Unit, Department of Microbiology, Hellenic Pasteur Institute, Athens, Greece
  - Laboratory of Medical Microbiology, Department of Microbiology, Hellenic Pasteur Institute, Athens, Greece
39
Pickering B, Lung O, Maguire F, Kruczkiewicz P, Kotwa JD, Buchanan T, Gagnier M, Guthrie JL, Jardine CM, Marchand-Austin A, Massé A, McClinchey H, Nirmalarajah K, Aftanas P, Blais-Savoie J, Chee HY, Chien E, Yim W, Banete A, Griffin BD, Yip L, Goolia M, Suderman M, Pinette M, Smith G, Sullivan D, Rudar J, Vernygora O, Adey E, Nebroski M, Goyette G, Finzi A, Laroche G, Ariana A, Vahkal B, Côté M, McGeer AJ, Nituch L, Mubareka S, Bowman J. Divergent SARS-CoV-2 variant emerges in white-tailed deer with deer-to-human transmission. Nat Microbiol 2022; 7:2011-2024. [PMID: 36357713 PMCID: PMC9712111 DOI: 10.1038/s41564-022-01268-9] [Citation(s) in RCA: 105] [Impact Index Per Article: 35.0] [Received: 04/14/2022] [Accepted: 10/13/2022] [Indexed: 11/12/2022]
Abstract
Wildlife reservoirs of broad-host-range viruses have the potential to enable evolution of viral variants that can emerge to infect humans. In North America, there is phylogenomic evidence of continual transmission of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) from humans to white-tailed deer (Odocoileus virginianus) through unknown means, but no evidence of transmission from deer to humans. We carried out an observational surveillance study in Ontario, Canada during November and December 2021 (n = 300 deer) and identified a highly divergent lineage of SARS-CoV-2 in white-tailed deer (B.1.641). This lineage is one of the most divergent SARS-CoV-2 lineages identified so far, with 76 mutations (including 37 previously associated with non-human mammalian hosts). From a set of five complete and two partial deer-derived viral genomes we applied phylogenomic, recombination, selection and mutation spectrum analyses, which provided evidence for evolution and transmission in deer and a shared ancestry with mink-derived virus. Our analysis also revealed an epidemiologically linked human infection. Taken together, our findings provide evidence for sustained evolution of SARS-CoV-2 in white-tailed deer and of deer-to-human transmission.
Collapse
Affiliation(s)
- Bradley Pickering
- National Centre for Foreign Animal Disease, Canadian Food Inspection Agency, Winnipeg, Manitoba, Canada.
- Department of Veterinary Microbiology and Preventative Medicine, College of Veterinary Medicine, Iowa State University, Ames, IA, USA.
- Department of Medical Microbiology and Infectious Diseases, University of Manitoba, Winnipeg, Manitoba, Canada.
| | - Oliver Lung
- National Centre for Foreign Animal Disease, Canadian Food Inspection Agency, Winnipeg, Manitoba, Canada
- Department of Biological Sciences, University of Manitoba, Winnipeg, Manitoba, Canada
| | - Finlay Maguire
- Faculty of Computer Science, Dalhousie University, Halifax, Nova Scotia, Canada
- Department of Community Health & Epidemiology, Dalhousie University, Halifax, Nova Scotia, Canada
- Shared Hospital Laboratory, Toronto, Ontario, Canada
- Sunnybrook Research Institute, Toronto, Ontario, Canada
| | - Peter Kruczkiewicz
- National Centre for Foreign Animal Disease, Canadian Food Inspection Agency, Winnipeg, Manitoba, Canada
| | | | - Tore Buchanan
- Wildlife Research and Monitoring Section, Ontario Ministry of Natural Resources and Forestry, Peterborough, Ontario, Canada
| | - Marianne Gagnier
- Ministère des Forêts, de la Faune et des Parcs, Quebec City, Quebec, Canada
| | - Jennifer L Guthrie
- Public Health Ontario, Toronto, Ontario, Canada
- Department of Microbiology & Immunology, Western University, London, Toronto, Ontario, Canada
| | - Claire M Jardine
- Canadian Wildlife Health Cooperative, Ontario-Nunavut, Department of Pathobiology, University of Guelph, Guelph, Ontario, Canada
| | | | - Ariane Massé
- Ministère des Forêts, de la Faune et des Parcs, Quebec City, Quebec, Canada
| | - Heather McClinchey
- Public Health, Health Protection and Surveillance Policy and Programs Branch, Ontario Ministry of Health, Toronto, Ontario, Canada
| | | | | | | | | | - Emily Chien
- Sunnybrook Research Institute, Toronto, Ontario, Canada
| | - Winfield Yim
- Sunnybrook Research Institute, Toronto, Ontario, Canada
| | - Andra Banete
- Sunnybrook Research Institute, Toronto, Ontario, Canada
| | | | - Lily Yip
- Sunnybrook Research Institute, Toronto, Ontario, Canada
| | - Melissa Goolia
- National Centre for Foreign Animal Disease, Canadian Food Inspection Agency, Winnipeg, Manitoba, Canada
| | - Matthew Suderman
- National Centre for Foreign Animal Disease, Canadian Food Inspection Agency, Winnipeg, Manitoba, Canada
| | - Mathieu Pinette
- National Centre for Foreign Animal Disease, Canadian Food Inspection Agency, Winnipeg, Manitoba, Canada
| | - Greg Smith
- National Centre for Foreign Animal Disease, Canadian Food Inspection Agency, Winnipeg, Manitoba, Canada
| | - Daniel Sullivan
- National Centre for Foreign Animal Disease, Canadian Food Inspection Agency, Winnipeg, Manitoba, Canada
- Department of Biological Sciences, University of Manitoba, Winnipeg, Manitoba, Canada
| | - Josip Rudar
- National Centre for Foreign Animal Disease, Canadian Food Inspection Agency, Winnipeg, Manitoba, Canada
| | - Oksana Vernygora
- National Centre for Foreign Animal Disease, Canadian Food Inspection Agency, Winnipeg, Manitoba, Canada
| | - Elizabeth Adey
- Wildlife Research and Monitoring Section, Ontario Ministry of Natural Resources and Forestry, Peterborough, Ontario, Canada
| | - Michelle Nebroski
- National Centre for Foreign Animal Disease, Canadian Food Inspection Agency, Winnipeg, Manitoba, Canada
| | | | - Andrés Finzi
- Centre de Recherche du CHUM, Montréal, Quebec, Canada
- Département de Microbiologie, Infectiologie et Immunologie, Université de Montréal, Montréal, Quebec, Canada
| | - Geneviève Laroche
- Department of Biochemistry, Microbiology and Immunology, University of Ottawa, Ottawa, Ontario, Canada
- Ottawa Institute of Systems Biology, University of Ottawa, Ottawa, Ontario, Canada
- Centre for Infection, Immunity, and Inflammation, University of Ottawa, Ottawa, Ontario, Canada
- Ardeshir Ariana
- Department of Biochemistry, Microbiology and Immunology, University of Ottawa, Ottawa, Ontario, Canada
- Ottawa Institute of Systems Biology, University of Ottawa, Ottawa, Ontario, Canada
- Centre for Infection, Immunity, and Inflammation, University of Ottawa, Ottawa, Ontario, Canada
- Brett Vahkal
- Department of Biochemistry, Microbiology and Immunology, University of Ottawa, Ottawa, Ontario, Canada
- Ottawa Institute of Systems Biology, University of Ottawa, Ottawa, Ontario, Canada
- Centre for Infection, Immunity, and Inflammation, University of Ottawa, Ottawa, Ontario, Canada
- Marceline Côté
- Department of Biochemistry, Microbiology and Immunology, University of Ottawa, Ottawa, Ontario, Canada
- Ottawa Institute of Systems Biology, University of Ottawa, Ottawa, Ontario, Canada
- Centre for Infection, Immunity, and Inflammation, University of Ottawa, Ottawa, Ontario, Canada
- Allison J McGeer
- Sinai Health System, Toronto, Ontario, Canada
- Department of Laboratory Medicine and Pathobiology, University of Toronto, Toronto, Ontario, Canada
- Larissa Nituch
- Wildlife Research and Monitoring Section, Ontario Ministry of Natural Resources and Forestry, Peterborough, Ontario, Canada
- Samira Mubareka
- Sunnybrook Research Institute, Toronto, Ontario, Canada.
- Department of Laboratory Medicine and Pathobiology, University of Toronto, Toronto, Ontario, Canada.
- Jeff Bowman
- Wildlife Research and Monitoring Section, Ontario Ministry of Natural Resources and Forestry, Peterborough, Ontario, Canada.
- Environmental and Life Sciences Graduate Program, Trent University, Peterborough, Ontario, Canada.
40
Bloom JD, Beichman AC, Neher RA, Harris K. Evolution of the SARS-CoV-2 mutational spectrum. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2022:2022.11.19.517207. [PMID: 36451887 PMCID: PMC9709787 DOI: 10.1101/2022.11.19.517207] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Indexed: 06/17/2023]
Abstract
SARS-CoV-2 evolves rapidly in part because of its high mutation rate. Here we examine whether this mutational process itself has changed during viral evolution. To do this, we quantify the relative rates of different types of single nucleotide mutations at four-fold degenerate sites in the viral genome across millions of human SARS-CoV-2 sequences. We find clear shifts in the relative rates of several types of mutations during SARS-CoV-2 evolution. The most striking trend is a roughly two-fold decrease in the relative rate of G→T mutations in Omicron versus early clades, as was recently noted by Ruis et al. (2022). There is also a decrease in the relative rate of C→T mutations in Delta, and other subtle changes in the mutation spectrum along the phylogeny. We speculate that these changes in the mutation spectrum could arise from viral mutations that affect genome replication, packaging, and antagonization of host innate-immune factors, although environmental factors could also play a role. Interestingly, the mutation spectrum of Omicron is more similar than that of earlier SARS-CoV-2 clades to the spectrum that shaped the long-term evolution of sarbecoviruses. Overall, our work shows that the mutation process is itself a dynamic variable during SARS-CoV-2 evolution, and suggests that human SARS-CoV-2 may be trending towards a mutation spectrum more similar to that of other animal sarbecoviruses.
Affiliation(s)
- Jesse D Bloom
- Basic Sciences Division and Computational Biology Program, Fred Hutchinson Cancer Center, Seattle, Washington, USA
- Department of Genome Sciences & Medical Scientist Training Program, University of Washington, Seattle, Washington, USA
- Howard Hughes Medical Institute, Seattle, WA, USA
- Annabel C Beichman
- Department of Genome Sciences & Medical Scientist Training Program, University of Washington, Seattle, Washington, USA
- Richard A Neher
- Biozentrum, University of Basel, Basel, Switzerland, Swiss Institute of Bioinformatics, Lausanne, Switzerland
41
Evolutionary Pattern Comparisons of the SARS-CoV-2 Delta Variant in Countries/Regions with High and Low Vaccine Coverage. Viruses 2022; 14:v14102296. [PMID: 36298851 PMCID: PMC9611485 DOI: 10.3390/v14102296] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 09/29/2022] [Revised: 10/09/2022] [Accepted: 10/10/2022] [Indexed: 11/16/2022] Open
Abstract
It has been argued that vaccine-breakthrough infections of SARS-CoV-2 would likely accelerate the emergence of novel variants with immune evasion. This study explored the evolutionary patterns of the Delta variant in countries/regions with relatively high and low vaccine coverage based on large-scale sequences. Our results showed that (i) the sequences were grouped into two clusters (L and R); the R cluster was dominant, its proportion increased over time and was higher in the high-vaccine-coverage areas; (ii) genetic diversities in the countries/regions with low vaccine coverage were higher than those in the ones with high vaccine coverage; (iii) unique mutations and co-mutations were detected in different countries/regions; in particular, common co-mutations were exhibited in highly occurring frequencies in the areas with high vaccine coverage and presented in increasing frequencies over time in the areas with low vaccine coverage; (iv) five sites on the S protein were under strong positive selection in different countries/regions, with three in non-C to U sites (I95T, G142D and T950N), and the occurring frequencies of I95T in high vaccine coverage areas were higher, while G142D and T950N were potentially immune-pressure-selected sites; and (v) mutation at the N6-methyladenosine site 4 on ORF7a (C27527T, P45L) was detected and might be caused by immune pressure. Our study suggested that certain variation differences existed between countries/regions with high and low vaccine coverage, but they were not likely caused by host immune pressure. We inferred that no extra immune pressures on SARS-CoV-2 were generated with high vaccine coverage, and we suggest promoting and strengthening the uptake of the COVID-19 vaccine worldwide, especially in less developed areas.
42
Wade KJ, Tisa S, Barrington C, Henriksen JC, Crooks KR, Gignoux CR, Almand AT, Steel JJ, Sitko JC, Rohrer JW, Wickert DP, Almand EA, Pollock DD, Rissland OS. Phylodynamics of a regional SARS-CoV-2 rapid spreading event in Colorado in late 2020. PLoS One 2022; 17:e0274050. [PMID: 36194597 PMCID: PMC9531818 DOI: 10.1371/journal.pone.0274050] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/26/2022] [Accepted: 08/20/2022] [Indexed: 11/07/2022] Open
Abstract
Since the initial reported discovery of SARS-CoV-2 in late 2019, genomic surveillance has been an important tool to understand its transmission and evolution. Here, we sought to describe the underlying regional phylodynamics before and during a rapid spreading event that was documented by surveillance protocols of the United States Air Force Academy (USAFA) in late October-November of 2020. We used replicate long-read sequencing on Colorado SARS-CoV-2 genomes collected July through November 2020 at the University of Colorado Anschutz Medical campus in Aurora and the United States Air Force Academy in Colorado Springs. Replicate sequencing allowed rigorous validation of variation and placement in a phylogenetic relatedness network. We focus on describing the phylodynamics of a lineage that likely originated in the local Colorado Springs community and expanded rapidly over the course of two months in an outbreak within the well-controlled environment of the United States Air Force Academy. Divergence estimates from sampling dates indicate that the SARS-CoV-2 lineage associated with this rapid expansion event originated in late October 2020. These results are in agreement with transmission pathways inferred by the United States Air Force Academy, and provide a window into the evolutionary process and transmission dynamics of a potentially dangerous but ultimately contained variant.
Affiliation(s)
- Kristen J. Wade
- Department of Biochemistry and Molecular Genetics, University of Colorado School of Medicine, Aurora, Colorado, United States of America
- Samantha Tisa
- Department of Biochemistry and Molecular Genetics, University of Colorado School of Medicine, Aurora, Colorado, United States of America
- Chloe Barrington
- Department of Biochemistry and Molecular Genetics, University of Colorado School of Medicine, Aurora, Colorado, United States of America
- Jesslyn C. Henriksen
- Department of Biochemistry and Molecular Genetics, University of Colorado School of Medicine, Aurora, Colorado, United States of America
- Kristy R. Crooks
- Colorado Center for Personalized Medicine, University of Colorado School of Medicine, Aurora, Colorado, United States of America
- Christopher R. Gignoux
- Colorado Center for Personalized Medicine, University of Colorado School of Medicine, Aurora, Colorado, United States of America
- Austin T. Almand
- Department of Biology, United States Air Force, Colorado Springs, Colorado, United States of America
- J. Jordan Steel
- Department of Biology, United States Air Force, Colorado Springs, Colorado, United States of America
- John C. Sitko
- Department of Biology, United States Air Force, Colorado Springs, Colorado, United States of America
- Joseph W. Rohrer
- Colorado Center for Personalized Medicine, University of Colorado School of Medicine, Aurora, Colorado, United States of America
- Douglas P. Wickert
- Department of Biology, United States Air Force, Colorado Springs, Colorado, United States of America
- Erin A. Almand
- Department of Biology, United States Air Force, Colorado Springs, Colorado, United States of America
- David D. Pollock
- Department of Biochemistry and Molecular Genetics, University of Colorado School of Medicine, Aurora, Colorado, United States of America
- Olivia S. Rissland
- Department of Biochemistry and Molecular Genetics, University of Colorado School of Medicine, Aurora, Colorado, United States of America
43
Silva TDS, Salvato RS, Gregianini TS, Gomes IA, Pereira EC, de Oliveira E, de Menezes AL, Barcellos RB, Godinho FM, Riediger I, Debur MDC, de Oliveira CM, Ribeiro-Rodrigues R, Miyajima F, Dias FS, Abbud A, do Monte-Neto R, Calzavara-Silva CE, Siqueira MM, Wallau GL, Resende PC, Fernandes GDR, Alves P. Molecular characterization of a new SARS-CoV-2 recombinant cluster XAG identified in Brazil. Front Med (Lausanne) 2022; 9:1008600. [PMID: 36250091 PMCID: PMC9554242 DOI: 10.3389/fmed.2022.1008600] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Received: 08/01/2022] [Accepted: 09/02/2022] [Indexed: 11/13/2022] Open
Abstract
Recombination events have been described in the Coronaviridae family. Since the beginning of the SARS-CoV-2 pandemic, a variable degree of selection pressure has acted upon the virus, generating new strains with increased fitness in terms of viral transmission and antibody escape. Most of the SC2 variants of concern (VOC) detected so far carry a combination of key amino acid changes and indels. Recombination may also reshuffle existing genetic profiles of distinct strains, potentially giving origin to recombinant strains with altered phenotypes. However, co-infection and recombination events are challenging to detect and require in-depth curation of assembled genomes and sequencing reads. Here, we present the molecular characterization of a new SARS-CoV-2 recombinant between BA.1.1 and BA.2.23 Omicron lineages identified in Brazil. We characterized four mutations that had not been previously described in any of the recombinants already identified worldwide and described the likely breaking points. Moreover, through phylogenetic analysis, we showed that the newly named XAG lineage groups in a highly supported monophyletic clade, confirming its common evolutionary history from parental Omicron lineages and other recombinants already described. These observations were only possible thanks to the joint effort of bioinformatics tools auxiliary in genomic surveillance and the manual curation of experienced personnel, demonstrating the importance of genetic and bioinformatic knowledge in genomics.
Affiliation(s)
- Eneida de Oliveira
- Laboratório Municipal de Referência, Setor de Biologia Molecular, Belo Horizonte, Brazil
- André Luiz de Menezes
- Laboratório Municipal de Referência, Setor de Biologia Molecular, Belo Horizonte, Brazil
- Irina Riediger
- Laboratório Central de Saúde Pública do Estado do Paraná, Curitiba, Brazil
- Gabriel Luz Wallau
- Instituto Aggeu Magalhães, Fundação Oswaldo Cruz, Rio de Janeiro, Brazil
- Pedro Alves
- Instituto René Rachou, Fundação Oswaldo Cruz, Belo Horizonte, Brazil
44
Turakhia Y, Thornlow B, Hinrichs A, McBroome J, Ayala N, Ye C, Smith K, De Maio N, Haussler D, Lanfear R, Corbett-Detig R. Pandemic-scale phylogenomics reveals the SARS-CoV-2 recombination landscape. Nature 2022; 609:994-997. [PMID: 35952714 PMCID: PMC9519458 DOI: 10.1038/s41586-022-05189-9] [Citation(s) in RCA: 74] [Impact Index Per Article: 24.7] [Received: 12/13/2021] [Accepted: 08/03/2022] [Indexed: 11/29/2022]
Abstract
Accurate and timely detection of recombinant lineages is crucial for interpreting genetic variation, reconstructing epidemic spread, identifying selection and variants of interest, and accurately performing phylogenetic analyses [1-4]. During the SARS-CoV-2 pandemic, genomic data generation has exceeded the capacities of existing analysis platforms, thereby crippling real-time analysis of viral evolution [5]. Here, we use a new phylogenomic method to search a nearly comprehensive SARS-CoV-2 phylogeny for recombinant lineages. In a 1.6 million sample tree from May 2021, we identify 589 recombination events, which indicate that around 2.7% of sequenced SARS-CoV-2 genomes have detectable recombinant ancestry. Recombination breakpoints are inferred to occur disproportionately in the 3' portion of the genome that contains the spike protein. Our results highlight the need for timely analyses of recombination for pinpointing the emergence of recombinant lineages with the potential to increase transmissibility or virulence of the virus. We anticipate that this approach will empower comprehensive real-time tracking of viral recombination during the SARS-CoV-2 pandemic and beyond.
Affiliation(s)
- Yatish Turakhia
- Department of Biomolecular Engineering, University of California, Santa Cruz, Santa Cruz, CA, USA.
- Genomics Institute, University of California, Santa Cruz, Santa Cruz, CA, USA.
- Department of Electrical and Computer Engineering, University of California, San Diego, San Diego, CA, USA.
- Bryan Thornlow
- Department of Biomolecular Engineering, University of California, Santa Cruz, Santa Cruz, CA, USA
- Genomics Institute, University of California, Santa Cruz, Santa Cruz, CA, USA
- Angie Hinrichs
- Genomics Institute, University of California, Santa Cruz, Santa Cruz, CA, USA
- Jakob McBroome
- Department of Biomolecular Engineering, University of California, Santa Cruz, Santa Cruz, CA, USA
- Genomics Institute, University of California, Santa Cruz, Santa Cruz, CA, USA
- Nicolas Ayala
- Department of Biomolecular Engineering, University of California, Santa Cruz, Santa Cruz, CA, USA
- Genomics Institute, University of California, Santa Cruz, Santa Cruz, CA, USA
- Cheng Ye
- Department of Electrical and Computer Engineering, University of California, San Diego, San Diego, CA, USA
- Kyle Smith
- Department of Biological Sciences, University of California, San Diego, San Diego, CA, USA
- Nicola De Maio
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Cambridge, UK
- David Haussler
- Department of Biomolecular Engineering, University of California, Santa Cruz, Santa Cruz, CA, USA
- Genomics Institute, University of California, Santa Cruz, Santa Cruz, CA, USA
- Howard Hughes Medical Institute, University of California, Santa Cruz, Santa Cruz, CA, USA
- Robert Lanfear
- Department of Ecology and Evolution, Research School of Biology, Australian National University, Canberra, Australian Capital Territory, Australia
- Russell Corbett-Detig
- Department of Biomolecular Engineering, University of California, Santa Cruz, Santa Cruz, CA, USA.
- Genomics Institute, University of California, Santa Cruz, Santa Cruz, CA, USA.
45
Ari E, Vásárhelyi BM, Kemenesi G, Tóth GE, Zana B, Somogyi B, Lanszki Z, Röst G, Jakab F, Papp B, Kintses B. A Single Early Introduction Governed Viral Diversity in the Second Wave of SARS-CoV-2 Epidemic in Hungary. Virus Evol 2022; 8:veac069. [PMID: 35996591 PMCID: PMC9384595 DOI: 10.1093/ve/veac069] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/27/2022] [Revised: 06/28/2022] [Accepted: 07/26/2022] [Indexed: 11/30/2022] Open
Abstract
Retrospective evaluation of past waves of the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) epidemic is key for designing optimal interventions against future waves and novel pandemics. Here, we report on analysing genome sequences of SARS-CoV-2 from the first two waves of the epidemic in 2020 in Hungary, mirroring a suppression and a mitigation strategy, respectively. Our analysis reveals that the two waves markedly differed in viral diversity and transmission patterns. Specifically, unlike in several European areas or in the USA, we have found no evidence for early introduction and cryptic transmission of the virus in the first wave of the pandemic in Hungary. Despite the introduction of multiple viral lineages, extensive community spread was prevented by a timely national lockdown in March 2020. In sharp contrast, the majority of the cases in the much larger second wave can be linked to a single transmission lineage of the pan-European B.1.160 variant. This lineage was introduced unexpectedly early, followed by a 2-month-long cryptic transmission before a surge in detected cases in September 2020. Epidemic analysis has revealed that the dominance of this lineage in the second wave was not associated with an intrinsic transmission advantage. This finding is further supported by the rapid replacement of B.1.160 by the alpha variant (B.1.1.7) that launched the third wave of the epidemic in February 2021. Overall, these results illustrate how the founder effect in combination with the cryptic transmission, instead of repeated international introductions or higher transmissibility, can govern viral diversity.
Affiliation(s)
- Eszter Ari
- HCEMM-BRC Metabolic Systems Biology Research Group , Temesvári krt. 62, 6726, Szeged, Hungary
- Synthetic and System Biology Unit, Institute of Biochemistry, Biological Research Centre, Eötvös Loránd Research Network (ELKH) , Temesvári krt. 62, 6726, Szeged, Hungary
- Department of Genetics, ELTE Eötvös Loránd University , Pázmány Péter sétány 1/C 1117, Budapest, Hungary
- Bálint Márk Vásárhelyi
- Synthetic and System Biology Unit, Institute of Biochemistry, Biological Research Centre, Eötvös Loránd Research Network (ELKH) , Temesvári krt. 62, 6726, Szeged, Hungary
- National Laboratory of Biotechnology, Biological Research Centre, Eötvös Loránd Research Network (ELKH) , Temesvári krt. 62, 6726, Szeged, Hungary
- Gábor Kemenesi
- National Laboratory of Virology, Virological Research Group, Szentágothai Research Centre, University of Pécs , Ifjúság útja 20, 7624, Pécs, Hungary
- Faculty of Sciences, Institute of Biology, University of Pécs , Ifjúság útja 6, 7624, Pécs, Hungary
- Gábor Endre Tóth
- National Laboratory of Virology, Virological Research Group, Szentágothai Research Centre, University of Pécs , Ifjúság útja 20, 7624, Pécs, Hungary
- Faculty of Sciences, Institute of Biology, University of Pécs , Ifjúság útja 6, 7624, Pécs, Hungary
- Brigitta Zana
- National Laboratory of Virology, Virological Research Group, Szentágothai Research Centre, University of Pécs , Ifjúság útja 20, 7624, Pécs, Hungary
- Faculty of Sciences, Institute of Biology, University of Pécs , Ifjúság útja 6, 7624, Pécs, Hungary
- Balázs Somogyi
- National Laboratory of Virology, Virological Research Group, Szentágothai Research Centre, University of Pécs , Ifjúság útja 20, 7624, Pécs, Hungary
- Faculty of Sciences, Institute of Biology, University of Pécs , Ifjúság útja 6, 7624, Pécs, Hungary
- Zsófia Lanszki
- National Laboratory of Virology, Virological Research Group, Szentágothai Research Centre, University of Pécs , Ifjúság útja 20, 7624, Pécs, Hungary
- Faculty of Sciences, Institute of Biology, University of Pécs , Ifjúság útja 6, 7624, Pécs, Hungary
- Gergely Röst
- National Laboratory for Health Security, Bolyai Institute, University of Szeged , Aradi vértanúk tere 1, 6720 Szeged, Hungary
- Ferenc Jakab
- National Laboratory of Virology, Virological Research Group, Szentágothai Research Centre, University of Pécs , Ifjúság útja 20, 7624, Pécs, Hungary
- Faculty of Sciences, Institute of Biology, University of Pécs , Ifjúság útja 6, 7624, Pécs, Hungary
- Balázs Papp
- HCEMM-BRC Metabolic Systems Biology Research Group , Temesvári krt. 62, 6726, Szeged, Hungary
- Synthetic and System Biology Unit, Institute of Biochemistry, Biological Research Centre, Eötvös Loránd Research Network (ELKH) , Temesvári krt. 62, 6726, Szeged, Hungary
- National Laboratory of Biotechnology, Biological Research Centre, Eötvös Loránd Research Network (ELKH) , Temesvári krt. 62, 6726, Szeged, Hungary
- Bálint Kintses
- HCEMM-BRC Translational Microbiology Research Group , Temesvári krt. 62, 6726, Szeged, Hungary
- Synthetic and System Biology Unit, Institute of Biochemistry, Biological Research Centre, Eötvös Loránd Research Network (ELKH) , Temesvári krt. 62, 6726, Szeged, Hungary
- National Laboratory of Biotechnology, Biological Research Centre, Eötvös Loránd Research Network (ELKH) , Temesvári krt. 62, 6726, Szeged, Hungary
- Department of Biochemistry and Molecular Biology, Faculty of Science and Informatics, University of Szeged , Közép fasor 52, 6726, Szeged, Hungary
46
Chakraborty C, Bhattacharya M, Sharma AR, Dhama K, Lee SS. Continent-wide evolutionary trends of emerging SARS-CoV-2 variants: dynamic profiles from Alpha to Omicron. GeroScience 2022; 44:2371-2392. [PMID: 35831773 PMCID: PMC9281186 DOI: 10.1007/s11357-022-00619-y] [Citation(s) in RCA: 11] [Impact Index Per Article: 3.7] [Received: 05/23/2022] [Accepted: 06/27/2022] [Indexed: 01/06/2023] Open
Abstract
The ongoing SARS-CoV-2 evolution process has generated several variants due to its continuous mutations, making pandemics more critical. The present study illustrates SARS-CoV-2 evolution and its emerging mutations in five directions. First, the significant mutations in the genome and S-glycoprotein were analyzed in different variants. Three linear models were developed with the regression line to depict the mutational load for S-glycoprotein, total genome excluding S-glycoprotein, and whole genome. Second, the continent-wide evolution of SARS-CoV-2 and its variants with their clades and divergence were evaluated. It showed the region-wise evolution of the SARS-CoV-2 variants and their clustering event. The major clades for each variant were identified. One example is clade 21K, a major clade of the Omicron variant. Third, lineage dynamics and comparison between SARS-CoV-2 lineages across different countries are also illustrated, demonstrating dominant variants in various countries over time. Fourth, gene-wise mutation patterns and genetic variability of SARS-CoV-2 variants across various countries are illustrated. High mutation patterns were found in the ORF10, ORF6, and S genes, and a low mutation pattern in the E gene. Finally, emerging AA point mutations (T478K, L452R, N501Y, S477N, E484A, Q498R, and Y505H), their frequencies, and country-wise occurrence were identified, and the highest event of two mutations (T478K and L452R) was observed.
Affiliation(s)
- Chiranjib Chakraborty
- Department of Biotechnology, School of Life Science and Biotechnology, Adamas University, Kolkata, West Bengal 700126 India
- Manojit Bhattacharya
- Department of Zoology, Fakir Mohan University, Vyasa Vihar, Balasore, 756020 Odisha India
- Ashish Ranjan Sharma
- Institute for Skeletal Aging & Orthopedic Surgery, Hallym University-Chuncheon Sacred Heart Hospital, Chuncheon-si, 24252 Gangwon-do Republic of Korea
- Kuldeep Dhama
- Division of Pathology, ICAR-Indian Veterinary Research Institute, Izatnagar, Bareilly, 243122 Uttar Pradesh India
- Sang-Soo Lee
- Institute for Skeletal Aging & Orthopedic Surgery, Hallym University-Chuncheon Sacred Heart Hospital, Chuncheon-si, 24252 Gangwon-do Republic of Korea
47
Morawiec E, Miklasińska-Majdanik M, Bratosiewicz-Wąsik J, Wojtyczka RD, Swolana D, Stolarek I, Czerwiński M, Skubis-Sikora A, Samul M, Polak A, Kruszniewska-Rajs C, Pudełko A, Figlerowicz M, Bednarska-Czerwińska A, Wąsik TJ. From Alpha to Delta-Genetic Epidemiology of SARS-CoV-2 (hCoV-19) in Southern Poland. Pathogens 2022; 11:780. [PMID: 35890025 PMCID: PMC9316897 DOI: 10.3390/pathogens11070780] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 06/14/2022] [Revised: 06/29/2022] [Accepted: 07/05/2022] [Indexed: 02/04/2023] Open
Abstract
In Poland, the first case of SARS-CoV-2 infection was confirmed in March 2020. Since then, many circulating virus lineages fueled rapid pandemic waves which inflicted a severe burden on the Polish healthcare system. Some of these lineages were associated with increased transmissibility and immune escape. Mutations in the viral spike protein, which is responsible for host cell recognition and serves as the primary target for neutralizing antibodies, are of particular importance. We investigated the molecular epidemiology of the SARS-CoV-2 clades circulating in Southern Poland from February 2021 to August 2021. The 921 whole-genome sequences were used for variant identification, spike mutation, and phylogenetic analyses. The Pango lineage B.1.1.7 was the dominant variant (n = 730, 89.68%) from March 2021 to July 2021. In July 2021, B.1.1.7 was displaced by the B.1.617.2 lineage, which reached frequencies of 66.66% in July 2021 and 92.3% in August 2021. Moreover, our results were compared with the sequencing available on the GISAID platform for other regions of Poland, the Czech Republic, and Slovakia. The analysis showed that the dominant variant in the analyzed period was B.1.1.7 in all countries and Southern Poland (Silesia). Interestingly, B.1.1.7 was replaced by B.1.617.2 earlier in Southern Poland than in the rest of the country. Moreover, in the Czech Republic and Slovakia, AY lineages were predominant at that time, contrary to the Silesia region.
Affiliation(s)
- Emilia Morawiec
- Department of Microbiology, Faculty of Medicine in Zabrze, Academy of Silesia in Katowice, 41-800 Zabrze, Poland;
- Gyncentrum, Laboratory of Molecular Biology and Virology, 40-851 Katowice, Poland; (M.C.); (A.S.-S.); (M.S.); (A.P.); (C.K.-R.); (A.P.); (A.B.-C.)
- Department of Histology, Cytophysiology and Embryology, Faculty of Medicine in Zabrze, Academy of Silesia in Katowice, 41-800 Zabrze, Poland
- Maria Miklasińska-Majdanik
- Department of Microbiology and Virology, Faculty of Pharmaceutical Sciences in Sosnowiec, Medical University of Silesia in Katowice, 41-200 Sosnowiec, Poland; (M.M.-M.); (R.D.W.); (D.S.)
- Jolanta Bratosiewicz-Wąsik
- Department of Biopharmacy, Faculty of Pharmaceutical Sciences in Sosnowiec, Medical University of Silesia in Katowice, 41-200 Sosnowiec, Poland;
- Robert D. Wojtyczka
- Department of Microbiology and Virology, Faculty of Pharmaceutical Sciences in Sosnowiec, Medical University of Silesia in Katowice, 41-200 Sosnowiec, Poland; (M.M.-M.); (R.D.W.); (D.S.)
- Denis Swolana
- Department of Microbiology and Virology, Faculty of Pharmaceutical Sciences in Sosnowiec, Medical University of Silesia in Katowice, 41-200 Sosnowiec, Poland; (M.M.-M.); (R.D.W.); (D.S.)
- Ireneusz Stolarek
- Department of Molecular and Systems Biology, Institute of Bioorganic Chemistry Polish Academy of Sciences, 61-704 Poznań, Poland; (I.S.); (M.F.)
- Michał Czerwiński
- Gyncentrum, Laboratory of Molecular Biology and Virology, 40-851 Katowice, Poland; (M.C.); (A.S.-S.); (M.S.); (A.P.); (C.K.-R.); (A.P.); (A.B.-C.)
- American Medical Clinic, 40-851 Katowice, Poland
- Aleksandra Skubis-Sikora
- Gyncentrum, Laboratory of Molecular Biology and Virology, 40-851 Katowice, Poland; (M.C.); (A.S.-S.); (M.S.); (A.P.); (C.K.-R.); (A.P.); (A.B.-C.)
- Department of Cytophysiology, Chair of Histology and Embryology, Faculty of Medical Sciences in Katowice, Medical University of Silesia in Katowice, 40-055 Katowice, Poland
- Magdalena Samul
- Gyncentrum, Laboratory of Molecular Biology and Virology, 40-851 Katowice, Poland; (M.C.); (A.S.-S.); (M.S.); (A.P.); (C.K.-R.); (A.P.); (A.B.-C.)
- Agnieszka Polak
- Gyncentrum, Laboratory of Molecular Biology and Virology, 40-851 Katowice, Poland; (M.C.); (A.S.-S.); (M.S.); (A.P.); (C.K.-R.); (A.P.); (A.B.-C.)
- Celina Kruszniewska-Rajs
- Gyncentrum, Laboratory of Molecular Biology and Virology, 40-851 Katowice, Poland; (M.C.); (A.S.-S.); (M.S.); (A.P.); (C.K.-R.); (A.P.); (A.B.-C.)
- Department of Molecular Biology, Faculty of Pharmaceutical Sciences in Sosnowiec, Medical University of Silesia in Katowice, 41-200 Sosnowiec, Poland
- Adam Pudełko
- Gyncentrum, Laboratory of Molecular Biology and Virology, 40-851 Katowice, Poland; (M.C.); (A.S.-S.); (M.S.); (A.P.); (C.K.-R.); (A.P.); (A.B.-C.)
- Department of Clinical Chemistry and Laboratory Diagnostics, Faculty of Pharmaceutical Sciences in Sosnowiec, Medical University of Silesia in Katowice, 41-200 Sosnowiec, Poland
- Marek Figlerowicz
- Department of Molecular and Systems Biology, Institute of Bioorganic Chemistry Polish Academy of Sciences, 61-704 Poznań, Poland; (I.S.); (M.F.)
- Anna Bednarska-Czerwińska
- Gyncentrum, Laboratory of Molecular Biology and Virology, 40-851 Katowice, Poland; (M.C.); (A.S.-S.); (M.S.); (A.P.); (C.K.-R.); (A.P.); (A.B.-C.)
- American Medical Clinic, 40-851 Katowice, Poland
- Department of Gynecology and Obstetrics, Faculty of Medicine in Zabrze, Academy of Silesia in Katowice, 41-800 Zabrze, Poland
- Tomasz J. Wąsik
- Department of Microbiology and Virology, Faculty of Pharmaceutical Sciences in Sosnowiec, Medical University of Silesia in Katowice, 41-200 Sosnowiec, Poland; (M.M.-M.); (R.D.W.); (D.S.)
| |
|
48
|
Pandit R, Singh I, Ansari A, Raval J, Patel Z, Dixit R, Shah P, Upadhyay K, Chauhan N, Desai K, Shah M, Modi B, Joshi M, Joshi C. First report on genome wide association study in western Indian population reveals host genetic factors for COVID-19 severity and outcome. Genomics 2022; 114:110399. [PMID: 35680011 PMCID: PMC9169419 DOI: 10.1016/j.ygeno.2022.110399] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/12/2021] [Revised: 05/18/2022] [Accepted: 06/01/2022] [Indexed: 02/07/2023]
Abstract
Different human races across the globe responded differently to SARS-CoV-2 infection, leading to differences in disease severity. Therefore, it is anticipated that host genetic factors have a direct association with COVID-19. We identified a total of 6, 7, and 6 genomic loci for the deceased-recovered, asymptomatic-recovered, and deceased-asymptomatic group comparisons, respectively. Unfavourable alleles of markers near genes associated with lung and heart diseases, such as tumor necrosis factor superfamily members (TNFSF4 and TNFSF18), showed noteworthy association with disease severity and outcome for COVID-19 patients in the western Indian population. The markers significantly associated with disease prognosis or recovery are of value in determining an individual's response to SARS-CoV-2 infection and can be used for risk prediction in COVID-19. In addition, GWAS in other populations from India may help to strengthen the findings of this study.
Affiliation(s)
- Ramesh Pandit
- Gujarat Biotechnology Research Centre (GBRC), Department of Science and Technology (Government of Gujarat), Gandhinagar, Gujarat 382011, India
| | - Indra Singh
- Gujarat Biotechnology Research Centre (GBRC), Department of Science and Technology (Government of Gujarat), Gandhinagar, Gujarat 382011, India
| | - Afzal Ansari
- Gujarat Biotechnology Research Centre (GBRC), Department of Science and Technology (Government of Gujarat), Gandhinagar, Gujarat 382011, India
| | - Janvi Raval
- Gujarat Biotechnology Research Centre (GBRC), Department of Science and Technology (Government of Gujarat), Gandhinagar, Gujarat 382011, India
| | - Zarna Patel
- Gujarat Biotechnology Research Centre (GBRC), Department of Science and Technology (Government of Gujarat), Gandhinagar, Gujarat 382011, India
| | - Raghav Dixit
- Commissionerate of Health Medical Services and Medical Education Gandhinagar, Gujarat 382010, India
| | - Pranay Shah
- Department of Microbiology, B.J. Medical College and Civil hospital, Institute of Medical Post-Graduate Studies and Research, Ahmedabad, Gujarat 380016, India
| | - Kamlesh Upadhyay
- Department of Medicine, B.J. Medical College and Civil hospital, Institute of Medical Post-Graduate Studies and Research, Ahmedabad, Gujarat 380016, India
| | - Naresh Chauhan
- Department of Community Medicine, Government Medical College, Surat, Gujarat 395001, India
| | - Kairavi Desai
- Department of Microbiology, Government Medical College, Bhavnagar, Gujarat 364001, India
| | - Meenakshi Shah
- Department of General Medicine, GMERS Medical College & Hospital, Gotri, Vadodara, Gujarat 390021, India
| | - Bhavesh Modi
- Department of Community Medicine, GMERS Medical College, Gandhinagar, Gujarat 382012, India
| | - Madhvi Joshi
- Gujarat Biotechnology Research Centre (GBRC), Department of Science and Technology (Government of Gujarat), Gandhinagar, Gujarat 382011, India
| | - Chaitanya Joshi
- Gujarat Biotechnology Research Centre (GBRC), Department of Science and Technology (Government of Gujarat), Gandhinagar, Gujarat 382011, India
| |
|
49
|
Balloux F, Tan C, Swadling L, Richard D, Jenner C, Maini M, van Dorp L. The past, current and future epidemiological dynamic of SARS-CoV-2. OXFORD OPEN IMMUNOLOGY 2022; 3:iqac003. [PMID: 35872966 PMCID: PMC9278178 DOI: 10.1093/oxfimm/iqac003] [Citation(s) in RCA: 17] [Impact Index Per Article: 5.7] [Received: 03/31/2022] [Revised: 05/11/2022] [Accepted: 06/15/2022] [Indexed: 02/07/2023] Open
Abstract
SARS-CoV-2, the agent of the COVID-19 pandemic, emerged in late 2019 in China, and rapidly spread throughout the world to reach all continents. As the virus expanded in its novel human host, viral lineages diversified through the accumulation of around two mutations a month on average. Different viral lineages have replaced each other since the start of the pandemic, with the most successful Alpha, Delta and Omicron variants of concern (VoCs) sequentially sweeping through the world to reach high global prevalence. Neither Alpha nor Delta was characterized by strong immune escape, with their success coming mainly from their higher transmissibility. Omicron is far more prone to immune evasion and spread primarily due to its increased ability to (re-)infect hosts with prior immunity. As host immunity reaches high levels globally through vaccination and prior infection, the epidemic is expected to transition from a pandemic regime to an endemic one where seasonality and waning host immunization are anticipated to become the primary forces shaping future SARS-CoV-2 lineage dynamics. In this review, we consider a body of evidence on the origins, host tropism, epidemiology, genomic and immunogenetic evolution of SARS-CoV-2 including an assessment of other coronaviruses infecting humans. Considering what is known so far, we conclude by delineating scenarios for the future dynamic of SARS-CoV-2, ranging from the good (circulation of a fifth endemic 'common cold' coronavirus of potentially low virulence), to the bad (a situation roughly comparable with seasonal flu), and the ugly (extensive diversification into serotypes with long-term high-level endemicity).
Affiliation(s)
- François Balloux
- UCL Genetics Institute, University College London, London WC1E 6BT, UK
| | - Cedric Tan
- UCL Genetics Institute, University College London, London WC1E 6BT, UK
- Genome Institute of Singapore, Agency for Science, Technology and Research (A*STAR), 138672 Singapore, Singapore
| | - Leo Swadling
- Division of Infection and Immunity, University College London, London NW3 2PP, UK
| | - Damien Richard
- UCL Genetics Institute, University College London, London WC1E 6BT, UK
- Division of Infection and Immunity, University College London, London NW3 2PP, UK
| | - Charlotte Jenner
- UCL Genetics Institute, University College London, London WC1E 6BT, UK
| | - Mala Maini
- Division of Infection and Immunity, University College London, London NW3 2PP, UK
| | - Lucy van Dorp
- UCL Genetics Institute, University College London, London WC1E 6BT, UK
| |
|
50
|
Insertion-and-Deletion Mutations between the Genomes of SARS-CoV, SARS-CoV-2, and Bat Coronavirus RaTG13. Microbiol Spectr 2022; 10:e0071622. [PMID: 35658573 PMCID: PMC9241832 DOI: 10.1128/spectrum.00716-22] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Indexed: 12/19/2022] Open
Abstract
The evolutionary process behind the development of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) remains inconclusive. This study compared the genome sequences of severe acute respiratory syndrome coronavirus (SARS-CoV), bat coronavirus RaTG13, and SARS-CoV-2. In total, the genomes of SARS-CoV-2 and RaTG13 were 77.9% and 77.7% identical to the genome of SARS-CoV, respectively. A total of 3.6% (1,068 bases) of the SARS-CoV-2 genome was derived from insertion and/or deletion (indel) mutations, and 18.6% (5,548 bases) was from point mutations relative to the genome of SARS-CoV. At least 35 indel sites were confirmed in the genome of SARS-CoV-2, of which 17 were ≥10 consecutive bases long. Ten of these relatively long indels were located in the spike (S) gene, five in the nonstructural protein 3 (Nsp3) gene of open reading frame (ORF) 1a, and one each in ORF8 and the noncoding region. Seventeen (48.6%) of the 35 indels were based on insertion-and-deletion mutations with exchanged gene sequences of 7–325 consecutive bases. Almost the complete ORF8 gene was replaced by a single indel 325 consecutive bases long. The distribution of these indels was roughly in accordance with the distribution of the point mutation rate around the indels. The genome sequence of SARS-CoV-2 was 96.0% identical to that of RaTG13. There was no long insertion-and-deletion mutation between the genomes of RaTG13 and SARS-CoV-2. The findings of the uneven distribution of multiple indels and the presence of multiple long insertion-and-deletion mutations with exchanged consecutive base sequences in the viral genome may provide insights into SARS-CoV-2 development.
IMPORTANCE The developmental mechanism of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) remains inconclusive. This study compared the base sequences one by one between severe acute respiratory syndrome coronavirus (SARS-CoV) or bat coronavirus RaTG13 and SARS-CoV-2. The genomes of SARS-CoV-2 and RaTG13 were 77.9% and 77.7% identical to the genome of SARS-CoV, respectively. Seventeen of the 35 sites with insertion and/or deletion mutations between SARS-CoV-2 and SARS-CoV were based on insertion-and-deletion mutations with the replacement of 7–325 consecutive bases. Most of these long insertion-and-deletion sites were concentrated in the nonstructural protein 3 (Nsp3) gene of open reading frame (ORF) 1a, the S1 domain of the spike protein, and the ORF8 gene. Such long insertion-and-deletion mutations were not observed between the genomes of RaTG13 and SARS-CoV-2. The presence of multiple long insertion-and-deletion mutations in the genome of SARS-CoV-2 and their uneven distribution may provide further insights into the development of the virus.
|