1
|
Hunter LA, Wyman S, Packel LJ, Facente SN, Li Y, Harte A, Nicolette G, Di Germanio C, Busch MP, Reingold AL, Petersen ML. Monitoring SARS-CoV-2 incidence and seroconversion among university students and employees: a longitudinal cohort study in California, June-August 2020. BMJ Open 2023; 13:e063999. [PMID: 37024253 PMCID: PMC10083519 DOI: 10.1136/bmjopen-2022-063999] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/08/2023]
Abstract
OBJECTIVES To identify incident SARS-CoV-2 infections and inform effective mitigation strategies in university settings, we piloted an integrated symptom and exposure monitoring and testing system among a cohort of university students and employees. DESIGN Prospective cohort study. SETTING A public university in California from June to August 2020. PARTICIPANTS 2180 university students and 738 university employees. PRIMARY OUTCOME MEASURES At baseline and endline, we tested participants for active SARS-CoV-2 infection via quantitative PCR (qPCR) test and collected blood samples for antibody testing. Participants received notifications to complete additional qPCR tests throughout the study if they reported symptoms or exposures in daily surveys or were selected for surveillance testing. Viral whole genome sequencing was performed on positive qPCR samples, and phylogenetic trees were constructed with these genomes and external genomes. RESULTS Over the study period, 57 students (2.6%) and 3 employees (0.4%) were diagnosed with SARS-CoV-2 infection via qPCR test. Phylogenetic analyses revealed that a super-spreader event among undergraduates in congregate housing accounted for at least 48% of cases among study participants but did not spread beyond campus. Test positivity was higher among participants who self-reported symptoms (incidence rate ratio (IRR) 12.7; 95% CI 7.4 to 21.8) or had household exposures (IRR 10.3; 95% CI 4.8 to 22.0) that triggered notifications to test. Most (91%) participants with newly identified antibodies at endline had been diagnosed with incident infection via qPCR test during the study. CONCLUSIONS Our findings suggest that integrated monitoring systems can successfully identify and link at-risk students to SARS-CoV-2 testing. 
As the study took place before the evolution of highly transmissible variants and widespread availability of vaccines and rapid antigen tests, further research is necessary to adapt and evaluate similar systems in the present context.
Affiliation(s)
- Lauren A Hunter
- School of Public Health, University of California, Berkeley, California, USA
| | - Stacia Wyman
- Innovative Genomics Institute, University of California, Berkeley, California, USA
| | - Laura J Packel
- School of Public Health, University of California, Berkeley, California, USA
| | - Shelley N Facente
- School of Public Health, University of California, Berkeley, California, USA
- Facente Consulting, Richmond, California, USA
| | - Yi Li
- School of Public Health, University of California, Berkeley, California, USA
| | - Anna Harte
- University Health Services, University of California, Berkeley, California, USA
| | - Guy Nicolette
- University Health Services, University of California, Berkeley, California, USA
| | | | - Michael P Busch
- Vitalant Research Institute, San Francisco, California, USA
- Department of Laboratory Medicine, University of California, San Francisco, California, USA
| | - Arthur L Reingold
- School of Public Health, University of California, Berkeley, California, USA
| | - Maya L Petersen
- School of Public Health, University of California, Berkeley, California, USA
| |
|
2
|
SARS-CoV-2 genome variations and evolution patterns in Egypt: a multi-center study. Sci Rep 2022; 12:14511. [PMID: 36008511 PMCID: PMC9403952 DOI: 10.1038/s41598-022-18644-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/15/2022] [Accepted: 08/17/2022] [Indexed: 12/25/2022]
Abstract
A serious global public health emergency, caused by a new highly pathogenic virus, SARS-CoV-2, emerged in late November 2019 in Wuhan City, China. The evolution and spread of the virus have been tracked by three developing databases, GISAID, Nextstrain and PANGO, to understand its circulating variants. In this study, 110 samples from patients diagnosed positive for COVID-19 were collected from Kasr Al-Aini Hospital and the Children Cancer Hospital Egypt 57357 between May 2020 and January 2021, with clinical severity ranging from mild to severe. The viral genomes were sequenced by next generation sequencing, and phylogenetic analysis was performed to understand viral transmission dynamics. According to Nextstrain clades, most of our sequenced samples belonged to clades 20A and 20D, which, in addition to clade 20B, were present from the beginning of sample collection in May 2020. Clades 19A and 19B, on the other hand, appeared in mid and late 2020, respectively, followed by the disappearance of clade 20B at the end of 2020. We identified a relatively high prevalence of the D614G spike protein variant and novel patterns of mutations associated with one another and with different clades. We also identified four mutations, spike H49Y, ORF3a H78Y, ORF8 E64stop and nucleocapsid E378V, associated with higher disease severity. Altogether, our study contributes genetic, phylogenetic, and clinical correlation data about the spread of the SARS-CoV-2 pandemic in Egypt.
|
3
|
McHenry A, Iyer K, Wang J, Liu C, Harigopal M. Detection of SARS-CoV-2 in tissue: the comparative roles of RT-qPCR, in situ RNA hybridization, and immunohistochemistry. Expert Rev Mol Diagn 2022; 22:559-574. [PMID: 35658709 DOI: 10.1080/14737159.2022.2085508] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Indexed: 11/04/2022]
Abstract
INTRODUCTION The emergence of SARS-CoV-2, the causative agent of the COVID-19 pandemic, has led to a rapidly expanding arsenal of molecular diagnostic assays for the detection of viral material in tissue specimens. AREAS COVERED We review the value and shortcomings of available tissue-based assays for SARS-CoV-2 detection in formalin-fixed paraffin-embedded (FFPE) tissue, including immunohistochemistry, in situ hybridization, and quantitative reverse transcription PCR (RT-qPCR). The validation, accuracy, and comparative utility of each method are discussed. Subsequently, we identify commercially available antibodies which render the greatest specificity and reproducibility of staining in FFPE specimens. EXPERT OPINION We offer expert opinion on the efficacy of such techniques and guidance for future implementation, both clinical and experimental.
Affiliation(s)
- Austin McHenry
- Yale University School of Medicine, Department of Pathology, New Haven, CT, 06520, United States
| | - Krishna Iyer
- Yale University School of Medicine, Department of Pathology, New Haven, CT, 06520, United States
| | - Jianhi Wang
- Yale University School of Medicine, Department of Pathology, New Haven, CT, 06520, United States
| | - Chen Liu
- Yale University School of Medicine, Department of Pathology, New Haven, CT, 06520, United States
| | - Malini Harigopal
- Yale University School of Medicine, Department of Pathology, New Haven, CT, 06520, United States
| |
|
4
|
Lee BT, Barber GP, Benet-Pagès A, Casper J, Clawson H, Diekhans M, Fischer C, Gonzalez JN, Hinrichs A, Lee C, Muthuraman P, Nassar L, Nguy B, Pereira T, Perez G, Raney B, Rosenbloom K, Schmelter D, Speir M, Wick B, Zweig A, Haussler D, Kuhn R, Haeussler M, Kent W. The UCSC Genome Browser database: 2022 update. Nucleic Acids Res 2022; 50:D1115-D1122. [PMID: 34718705 PMCID: PMC8728131 DOI: 10.1093/nar/gkab959] [Citation(s) in RCA: 144] [Impact Index Per Article: 72.0] [Received: 09/15/2021] [Revised: 09/30/2021] [Accepted: 10/04/2021] [Indexed: 11/25/2022]
Abstract
The UCSC Genome Browser, https://genome.ucsc.edu, is a graphical viewer for exploring genome annotations. The website provides integrated tools for visualizing, comparing, analyzing, and sharing both publicly available and user-generated genomic datasets. Data highlights this year include a collection of easily accessible public hub assemblies on new organisms, now featuring BLAT alignment and PCR capabilities, and new and updated clinical tracks (gnomAD, DECIPHER, CADD, REVEL). We introduced a new Track Sets feature and enhanced variant displays to aid in the interpretation of clinical data. We also added a tool to rapidly place new SARS-CoV-2 genomes in a global phylogenetic tree enabling researchers to view the context of emerging mutations in our SARS-CoV-2 Genome Browser. Other new software focuses on usability features, including more informative mouseover displays and new fonts.
Affiliation(s)
- Brian T Lee
- Genomics Institute, University of California Santa Cruz, Santa Cruz, CA 95064, USA
| | - Galt P Barber
- Genomics Institute, University of California Santa Cruz, Santa Cruz, CA 95064, USA
| | - Anna Benet-Pagès
- Genomics Institute, University of California Santa Cruz, Santa Cruz, CA 95064, USA
- Medical Genetics Center (Medizinisch Genetisches Zentrum), Munich 80335, Germany
| | - Jonathan Casper
- Genomics Institute, University of California Santa Cruz, Santa Cruz, CA 95064, USA
| | - Hiram Clawson
- Genomics Institute, University of California Santa Cruz, Santa Cruz, CA 95064, USA
| | - Mark Diekhans
- Genomics Institute, University of California Santa Cruz, Santa Cruz, CA 95064, USA
| | - Clay Fischer
- Genomics Institute, University of California Santa Cruz, Santa Cruz, CA 95064, USA
| | | | - Angie S Hinrichs
- Genomics Institute, University of California Santa Cruz, Santa Cruz, CA 95064, USA
| | - Christopher M Lee
- Genomics Institute, University of California Santa Cruz, Santa Cruz, CA 95064, USA
| | - Pranav Muthuraman
- Genomics Institute, University of California Santa Cruz, Santa Cruz, CA 95064, USA
| | - Luis R Nassar
- Genomics Institute, University of California Santa Cruz, Santa Cruz, CA 95064, USA
| | - Beagan Nguy
- Genomics Institute, University of California Santa Cruz, Santa Cruz, CA 95064, USA
| | - Tiana Pereira
- Genomics Institute, University of California Santa Cruz, Santa Cruz, CA 95064, USA
| | - Gerardo Perez
- Genomics Institute, University of California Santa Cruz, Santa Cruz, CA 95064, USA
| | - Brian J Raney
- Genomics Institute, University of California Santa Cruz, Santa Cruz, CA 95064, USA
| | - Kate R Rosenbloom
- Genomics Institute, University of California Santa Cruz, Santa Cruz, CA 95064, USA
| | - Daniel Schmelter
- Genomics Institute, University of California Santa Cruz, Santa Cruz, CA 95064, USA
| | - Matthew L Speir
- Genomics Institute, University of California Santa Cruz, Santa Cruz, CA 95064, USA
| | - Brittney D Wick
- Genomics Institute, University of California Santa Cruz, Santa Cruz, CA 95064, USA
| | - Ann S Zweig
- Genomics Institute, University of California Santa Cruz, Santa Cruz, CA 95064, USA
| | - David Haussler
- Genomics Institute, University of California Santa Cruz, Santa Cruz, CA 95064, USA
| | - Robert M Kuhn
- Genomics Institute, University of California Santa Cruz, Santa Cruz, CA 95064, USA
| | - Maximilian Haeussler
- Genomics Institute, University of California Santa Cruz, Santa Cruz, CA 95064, USA
| | - W James Kent
- Genomics Institute, University of California Santa Cruz, Santa Cruz, CA 95064, USA
| |
|
5
|
Shchur V, Spirin V, Sirotkin D, Burovski E, De Maio N, Corbett-Detig R. VGsim: scalable viral genealogy simulator for global pandemic. medRxiv 2021:2021.04.21.21255891. [PMID: 33948608 PMCID: PMC8095227 DOI: 10.1101/2021.04.21.21255891] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Indexed: 11/24/2022]
Abstract
Accurate simulation of complex biological processes is an essential component of developing and validating new technologies and inference approaches. As an effort to help contain the COVID-19 pandemic, large numbers of SARS-CoV-2 genomes have been sequenced from most regions in the world. More than 5.5 million viral sequences are publicly available as of November 2021. Many studies estimate viral genealogies from these sequences, as these can provide valuable information about the spread of the pandemic across time and space. Additionally, such data are a rich source of information about molecular evolutionary processes including natural selection, for example allowing the identification of new variants with transmissibility and immunity evasion advantages. To our knowledge, there is no framework that is both efficient and flexible enough to simulate the pandemic to approximate world-scale scenarios and generate viral genealogies of millions of samples. Here, we introduce a new fast simulator, VGsim, which addresses the problem of simulating genealogies under epidemiological models. The simulation process is split into two phases. During the forward run the algorithm generates a chain of population-level events reflecting the dynamics of the pandemic using a hierarchical version of the Gillespie algorithm. During the backward run a coalescent-like approach generates a tree genealogy of samples conditioning on the population-level events chain generated during the forward run. Our software can model complex population structure, epistasis and immunity escape. The code is freely available at https://github.com/Genomics-HSE/VGsim.
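The forward phase described in this abstract builds on the Gillespie algorithm. As a rough illustration only, the following is a minimal sketch of the plain (non-hierarchical) Gillespie method for a single well-mixed SIR population. It is not VGsim's implementation or API; the function name `gillespie_sir` and the parameters `beta`, `gamma`, `s0`, `i0` and `t_max` are assumptions introduced for this example, and the backward genealogy phase is not shown.

```python
import random


def gillespie_sir(beta, gamma, s0, i0, r0=0, t_max=100.0, seed=0):
    """Simulate a well-mixed SIR epidemic with the direct Gillespie method.

    Hypothetical helper for illustration; returns the chain of
    population-level events as (time, S, I, R) tuples.
    """
    rng = random.Random(seed)
    t, s, i, r = 0.0, s0, i0, r0
    n = s0 + i0 + r0
    events = [(t, s, i, r)]
    while i > 0 and t < t_max:
        rate_infect = beta * s * i / n   # S + I -> 2I
        rate_recover = gamma * i         # I -> R
        total = rate_infect + rate_recover
        # The waiting time to the next event is exponential in the total rate.
        t += rng.expovariate(total)
        if t > t_max:
            break
        # Pick which event fires, with probability proportional to its rate.
        if rng.random() * total < rate_infect:
            s, i = s - 1, i + 1
        else:
            i, r = i - 1, r + 1
        events.append((t, s, i, r))
    return events


events = gillespie_sir(beta=0.5, gamma=0.2, s0=99, i0=1, t_max=50.0, seed=1)
```

Each tuple in `events` records one population-level event; a backward, coalescent-like pass of the kind the abstract describes would then be conditioned on such a chain.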
Affiliation(s)
| | | | | | | | - Nicola De Maio
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridgeshire, CB10 1SD, UK
| | - Russell Corbett-Detig
- HSE University, Russian Federation
- Department of Biomolecular Engineering and Genomics Institute, UC Santa Cruz, California 95064
| |
|
6
|
Exploiting genomic surveillance to map the spatio-temporal dispersal of SARS-CoV-2 spike mutations in Belgium across 2020. Sci Rep 2021; 11:18580. [PMID: 34535691 PMCID: PMC8448849 DOI: 10.1038/s41598-021-97667-9] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 03/18/2021] [Accepted: 08/24/2021] [Indexed: 11/21/2022]
Abstract
At the end of 2020, several new variants of SARS-CoV-2—designated variants of concern—were detected and quickly suspected to be associated with a higher transmissibility and possible escape of vaccine-induced immunity. In Belgium, this discovery has motivated the initiation of a more ambitious genomic surveillance program, which is drastically increasing the number of SARS-CoV-2 genomes to analyse for monitoring the circulation of viral lineages and variants of concern. In order to efficiently analyse the massive collection of genomic data that are the result of such increased sequencing efforts, streamlined analytical strategies are crucial. In this study, we illustrate how to efficiently map the spatio-temporal dispersal of target mutations at a regional level. As a proof of concept, we focus on the Belgian province of Liège that has been consistently sampled throughout 2020, but was also one of the main epicenters of the second European epidemic wave. Specifically, we employ a recently developed phylogeographic workflow to infer the regional dispersal history of viral lineages associated with three specific mutations on the spike protein (S98F, A222V and S477N) and to quantify their relative importance through time. Our analytical pipeline enables analysing large data sets and has the potential to be quickly applied and updated to track target mutations in space and time throughout the course of an epidemic.
|
7
|
McBroome J, Thornlow B, Hinrichs AS, De Maio N, Goldman N, Haussler D, Corbett-Detig R, Turakhia Y. A daily-updated database and tools for comprehensive SARS-CoV-2 mutation-annotated trees. bioRxiv 2021. [PMID: 33821270 PMCID: PMC8020970 DOI: 10.1101/2021.04.03.438321] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Indexed: 02/05/2023]
Abstract
The vast scale of SARS-CoV-2 sequencing data has made it increasingly challenging to comprehensively analyze all available data using existing tools and file formats. To address this, we present a database of SARS-CoV-2 phylogenetic trees inferred with unrestricted public sequences, which we update daily to incorporate new sequences. Our database uses the recently-proposed mutation-annotated tree (MAT) format to efficiently encode the tree with branches labeled with parsimony-inferred mutations as well as Nextstrain clade and Pango lineage labels at clade roots. As of June 9, 2021, our SARS-CoV-2 MAT consists of 834,521 sequences and provides a comprehensive view of the virus’ evolutionary history using public data. We also present matUtils – a command-line utility for rapidly querying, interpreting and manipulating the MATs. Our daily-updated SARS-CoV-2 MAT database and matUtils software are available at http://hgdownload.soe.ucsc.edu/goldenPath/wuhCor1/UShER_SARS-CoV-2/ and https://github.com/yatisht/usher, respectively.
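The idea of a mutation-annotated tree, where mutations are stored on branches rather than full sequences at the tips, can be sketched as follows. This is a toy in-memory illustration, not the MAT file format or the matUtils interface; the `Node` class, the `leaf_genotypes` function and the small example tree are hypothetical and introduced only for this sketch.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """One tree node; `mutations` are those on the branch leading to it."""
    name: str
    mutations: list = field(default_factory=list)   # e.g. ["C241T"]
    children: list = field(default_factory=list)


def leaf_genotypes(root):
    """Return {leaf name: {position: derived allele}} for every leaf.

    Mutations are accumulated from the root down; a later mutation at the
    same position overrides an earlier one (e.g. a reversion).
    """
    result = {}
    stack = [(root, {})]
    while stack:
        node, inherited = stack.pop()
        alleles = dict(inherited)          # copy so siblings stay independent
        for mut in node.mutations:
            position = int(mut[1:-1])      # "C241T" -> 241
            alleles[position] = mut[-1]    # "C241T" -> "T"
        if not node.children:
            result[node.name] = alleles
        for child in node.children:
            stack.append((child, alleles))
    return result


# Illustrative tree: an internal node carries C241T; one leaf adds A23403G,
# the other reverts position 241.
example_root = Node("root", [], [
    Node("internal", ["C241T"], [
        Node("leaf1", ["A23403G"]),
        Node("leaf2", ["T241C"]),
    ]),
])
genotypes = leaf_genotypes(example_root)
```

Because each mutation is stored once on the branch where it was inferred, related samples share most of their annotation, which is the space saving the abstract attributes to the MAT format.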
Affiliation(s)
- Jakob McBroome
- Department of Biomolecular Engineering, University of California Santa Cruz, Santa Cruz, CA 95064, USA; Genomics Institute, University of California Santa Cruz, Santa Cruz, CA 95064, USA
| | - Bryan Thornlow
- Department of Biomolecular Engineering, University of California Santa Cruz, Santa Cruz, CA 95064, USA; Genomics Institute, University of California Santa Cruz, Santa Cruz, CA 95064, USA
| | - Angie S Hinrichs
- Genomics Institute, University of California Santa Cruz, Santa Cruz, CA 95064, USA
| | - Nicola De Maio
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Cambridge CB10 1SD, UK
| | - Nick Goldman
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Cambridge CB10 1SD, UK
| | - David Haussler
- Department of Biomolecular Engineering, University of California Santa Cruz, Santa Cruz, CA 95064, USA; Genomics Institute, University of California Santa Cruz, Santa Cruz, CA 95064, USA
| | - Russell Corbett-Detig
- Department of Biomolecular Engineering, University of California Santa Cruz, Santa Cruz, CA 95064, USA; Genomics Institute, University of California Santa Cruz, Santa Cruz, CA 95064, USA
| | - Yatish Turakhia
- Department of Biomolecular Engineering, University of California Santa Cruz, Santa Cruz, CA 95064, USA; Genomics Institute, University of California Santa Cruz, Santa Cruz, CA 95064, USA
| |
|
8
|
Nicholls SM, Poplawski R, Bull MJ, Underwood A, Chapman M, Abu-Dahab K, Taylor B, Colquhoun RM, Rowe WPM, Jackson B, Hill V, O'Toole Á, Rey S, Southgate J, Amato R, Livett R, Gonçalves S, Harrison EM, Peacock SJ, Aanensen DM, Rambaut A, Connor TR, Loman NJ. CLIMB-COVID: continuous integration supporting decentralised sequencing for SARS-CoV-2 genomic surveillance. Genome Biol 2021; 22:196. [PMID: 34210356 PMCID: PMC8247108 DOI: 10.1186/s13059-021-02395-y] [Citation(s) in RCA: 38] [Impact Index Per Article: 12.7] [Received: 05/01/2021] [Accepted: 05/28/2021] [Indexed: 12/15/2022]
Abstract
In response to the ongoing SARS-CoV-2 pandemic in the UK, the COVID-19 Genomics UK (COG-UK) consortium was formed to rapidly sequence SARS-CoV-2 genomes as part of a national-scale genomic surveillance strategy. The network consists of universities, academic institutes, regional sequencing centres and the four UK Public Health Agencies. We describe the development and deployment of CLIMB-COVID, an encompassing digital infrastructure to address the challenge of collecting and integrating both genomic sequencing data and sample-associated metadata produced across the COG-UK network.
Affiliation(s)
- Samuel M Nicholls
- Institute of Microbiology and Infection, University of Birmingham, Birmingham, UK
| | - Radoslaw Poplawski
- Institute of Microbiology and Infection, University of Birmingham, Birmingham, UK
| | - Matthew J Bull
- Pathogen Genomics Unit, Public Health Wales NHS Trust, Cardiff, UK
| | - Anthony Underwood
- Centre for Genomic Pathogen Surveillance, Wellcome Genome Campus, Hinxton, UK
- Oxford Big Data Institute, Old Road Campus, Oxford, UK
| | - Michael Chapman
- Health Data Research UK Cambridge, Wellcome Genome Campus, Hinxton, UK
| | - Khalil Abu-Dahab
- Centre for Genomic Pathogen Surveillance, Wellcome Genome Campus, Hinxton, UK
- Oxford Big Data Institute, Old Road Campus, Oxford, UK
| | - Ben Taylor
- Centre for Genomic Pathogen Surveillance, Wellcome Genome Campus, Hinxton, UK
- Oxford Big Data Institute, Old Road Campus, Oxford, UK
| | - Rachel M Colquhoun
- Institute of Evolutionary Biology, University of Edinburgh, Edinburgh, UK
| | - Will P M Rowe
- Institute of Microbiology and Infection, University of Birmingham, Birmingham, UK
| | - Ben Jackson
- Institute of Evolutionary Biology, University of Edinburgh, Edinburgh, UK
| | - Verity Hill
- Institute of Evolutionary Biology, University of Edinburgh, Edinburgh, UK
| | - Áine O'Toole
- Institute of Evolutionary Biology, University of Edinburgh, Edinburgh, UK
| | - Sara Rey
- Pathogen Genomics Unit, Public Health Wales NHS Trust, Cardiff, UK
| | - Joel Southgate
- School of Biosciences, The Sir Martin Evans Building, Cardiff University, Cardiff, UK
| | | | | | | | - Ewan M Harrison
- Wellcome Sanger Institute, Hinxton, UK
- Department of Medicine, University of Cambridge, Cambridge, UK
- Department of Public Health and Primary Care, University of Cambridge, Cambridge, UK
| | | | - David M Aanensen
- Centre for Genomic Pathogen Surveillance, Wellcome Genome Campus, Hinxton, UK
- Oxford Big Data Institute, Old Road Campus, Oxford, UK
| | - Andrew Rambaut
- Institute of Evolutionary Biology, University of Edinburgh, Edinburgh, UK
| | - Thomas R Connor
- Pathogen Genomics Unit, Public Health Wales NHS Trust, Cardiff, UK
- School of Biosciences, The Sir Martin Evans Building, Cardiff University, Cardiff, UK
- Quadram Institute, Norwich, UK
| | - Nicholas J Loman
- Institute of Microbiology and Infection, University of Birmingham, Birmingham, UK.
| |
|
9
|
De Maio N, Walker CR, Turakhia Y, Lanfear R, Corbett-Detig R, Goldman N. Mutation Rates and Selection on Synonymous Mutations in SARS-CoV-2. Genome Biol Evol 2021; 13:evab087. [PMID: 33895815 PMCID: PMC8135539 DOI: 10.1093/gbe/evab087] [Citation(s) in RCA: 64] [Impact Index Per Article: 21.3] [Accepted: 04/19/2021] [Indexed: 12/23/2022]
Abstract
The COVID-19 pandemic has seen an unprecedented response from the sequencing community. Leveraging the sequence data from more than 140,000 SARS-CoV-2 genomes, we study mutation rates and selective pressures affecting the virus. Understanding the processes and effects of mutation and selection has profound implications for the study of viral evolution, for vaccine design, and for the tracking of viral spread. We highlight and address some common genome sequence analysis pitfalls that can lead to inaccurate inference of mutation rates and selection, such as ignoring skews in the genetic code, not accounting for recurrent mutations, and assuming evolutionary equilibrium. We find that two particular mutation rates, G→U and C→U, are similarly elevated and considerably higher than all other mutation rates, causing the majority of mutations in the SARS-CoV-2 genome, and are possibly the result of APOBEC and ROS activity. These mutations also tend to occur many times at the same genome positions along the global SARS-CoV-2 phylogeny (i.e., they are very homoplasic). We observe an effect of genomic context on mutation rates, but the effect of the context is overall limited. Although previous studies have suggested selection acting to decrease U content at synonymous sites, we bring forward evidence suggesting the opposite.
Affiliation(s)
- Nicola De Maio
- European Molecular Biology Laboratory, European Bioinformatics Institute, Cambridgeshire, United Kingdom
| | - Conor R Walker
- European Molecular Biology Laboratory, European Bioinformatics Institute, Cambridgeshire, United Kingdom
- Department of Genetics, University of Cambridge, United Kingdom
| | - Yatish Turakhia
- Department of Biomolecular Engineering, University of California, Santa Cruz, California, USA
- Genomics Institute, University of California, Santa Cruz, California, USA
| | - Robert Lanfear
- Department of Ecology and Evolution, Research School of Biology, Australian National University, Canberra, ACT, Australia
| | - Russell Corbett-Detig
- Department of Biomolecular Engineering, University of California, Santa Cruz, California, USA
- Genomics Institute, University of California, Santa Cruz, California, USA
| | - Nick Goldman
- European Molecular Biology Laboratory, European Bioinformatics Institute, Cambridgeshire, United Kingdom
| |
|
10
|
Thornlow B, Hinrichs AS, Jain M, Dhillon N, La S, Kapp JD, Anigbogu I, Cassatt-Johnstone M, McBroome J, Haeussler M, Turakhia Y, Chang T, Olsen HE, Sanford J, Stone M, Vaske O, Bjork I, Akeson M, Shapiro B, Haussler D, Kilpatrick AM, Corbett-Detig R. A new SARS-CoV-2 lineage that shares mutations with known Variants of Concern is rejected by automated sequence repository quality control. bioRxiv 2021:2021.04.05.438352. [PMID: 33851162 PMCID: PMC8043452 DOI: 10.1101/2021.04.05.438352] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Indexed: 11/25/2022]
Abstract
We report a SARS-CoV-2 lineage that shares N501Y, P681H, and other mutations with known variants of concern, such as B.1.1.7. This lineage, which we refer to as B.1.x (COG-UK sometimes references similar samples as B.1.324.1), is present in at least 20 states across the USA and in at least six countries. However, a large deletion causes the sequence to be automatically rejected from repositories, suggesting that the frequency of this new lineage is underestimated using public data. Recent dynamics based on 339 samples obtained in Santa Cruz County, CA, USA suggest that B.1.x may be increasing in frequency at a rate similar to that of B.1.1.7 in Southern California. At present the functional differences between this variant B.1.x and other circulating SARS-CoV-2 variants are unknown, and further studies on secondary attack rates, viral loads, immune evasion and/or disease severity are needed to determine if it poses a public health concern. Nonetheless, given what is known from well-studied circulating variants of concern, it seems unlikely that the lineage could pose larger concerns for human health than many already globally distributed lineages. Our work highlights a need for rapid turnaround time from sequence generation to submission and improved sequence quality control that removes submission bias. We identify promising paths toward this goal.
Affiliation(s)
- Bryan Thornlow
- Department of Biomolecular Engineering, University of California, Santa Cruz
- Genomics Institute, University of California, Santa Cruz
| | | | - Miten Jain
- Department of Biomolecular Engineering, University of California, Santa Cruz
- Genomics Institute, University of California, Santa Cruz
| | - Namrita Dhillon
- Molecular Diagnostics Laboratory, University of California, Santa Cruz
| | - Scott La
- Molecular Diagnostics Laboratory, University of California, Santa Cruz
| | - Joshua D. Kapp
- Department of Ecology and Evolutionary Biology, University of California, Santa Cruz
| | - Ikenna Anigbogu
- Department of Biomolecular Engineering, University of California, Santa Cruz
| | | | - Jakob McBroome
- Department of Biomolecular Engineering, University of California, Santa Cruz
- Genomics Institute, University of California, Santa Cruz
| | | | - Yatish Turakhia
- Genomics Institute, University of California, Santa Cruz
- Howard Hughes Medical Institute, University of California, Santa Cruz
| | - Terren Chang
- Molecular Diagnostics Laboratory, University of California, Santa Cruz
| | - Hugh E Olsen
- Department of Biomolecular Engineering, University of California, Santa Cruz
- Genomics Institute, University of California, Santa Cruz
| | - Jeremy Sanford
- Genomics Institute, University of California, Santa Cruz
- Molecular Diagnostics Laboratory, University of California, Santa Cruz
- Department of Molecular Cellular and Developmental Biology, University of California, Santa Cruz
| | - Michael Stone
- Molecular Diagnostics Laboratory, University of California, Santa Cruz
- Department of Chemistry and Biochemistry, University of California, Santa Cruz
| | - Olena Vaske
- Genomics Institute, University of California, Santa Cruz
- Molecular Diagnostics Laboratory, University of California, Santa Cruz
- Department of Molecular Cellular and Developmental Biology, University of California, Santa Cruz
| | - Isabel Bjork
- Genomics Institute, University of California, Santa Cruz
| | - Mark Akeson
- Department of Biomolecular Engineering, University of California, Santa Cruz
| | - Beth Shapiro
- Department of Ecology and Evolutionary Biology, University of California, Santa Cruz
- Howard Hughes Medical Institute, University of California, Santa Cruz
| | - David Haussler
- Department of Biomolecular Engineering, University of California, Santa Cruz
- Genomics Institute, University of California, Santa Cruz
- Howard Hughes Medical Institute, University of California, Santa Cruz
| | - A. Marm Kilpatrick
- Department of Ecology and Evolutionary Biology, University of California, Santa Cruz
| | - Russell Corbett-Detig
- Department of Biomolecular Engineering, University of California, Santa Cruz
- Genomics Institute, University of California, Santa Cruz
| |
|
12
|
Folkerts ML, Lemmer D, Pfeiffer A, Vasquez D, French C, Jones A, Nguyen M, Larsen B, Porter WT, Sheridan K, Bowers JR, Engelthaler DM. Methods for sequencing the pandemic: benefits of rapid or high-throughput processing. F1000Res 2021; 10:ISCB Comm J-48. [PMID: 35342619 PMCID: PMC8921685 DOI: 10.12688/f1000research.28352.2] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Accepted: 02/11/2022] [Indexed: 12/21/2022]
Abstract
Genomic epidemiology has proven successful for real-time and retrospective monitoring of small and large-scale outbreaks. Here, we report two genomic sequencing and analysis strategies for rapid-turnaround or high-throughput processing of metagenomic samples. The rapid-turnaround method was designed to provide a quick phylogenetic snapshot of samples at the heart of active outbreaks, and has a total turnaround time of <48 hours from raw sample to analyzed data. The high-throughput method, first reported here for SARS-CoV-2, was designed for semi-retrospective data analysis, and is both cost effective and highly scalable. Though these methods were developed and utilized for the SARS-CoV-2 pandemic response in Arizona, USA, we envision their use for infectious disease epidemiology in the 21st century.
Affiliation(s)
- Megan L. Folkerts
  - Pathogen Genomics Division, Translational Genomics Research Institute, Flagstaff, AZ, 86005, USA
- Darrin Lemmer
  - Pathogen Genomics Division, Translational Genomics Research Institute, Flagstaff, AZ, 86005, USA
- Ashlyn Pfeiffer
  - Pathogen Genomics Division, Translational Genomics Research Institute, Flagstaff, AZ, 86005, USA
- Danielle Vasquez
  - Pathogen Genomics Division, Translational Genomics Research Institute, Flagstaff, AZ, 86005, USA
- Chris French
  - Pathogen Genomics Division, Translational Genomics Research Institute, Flagstaff, AZ, 86005, USA
- Amber Jones
  - Pathogen Genomics Division, Translational Genomics Research Institute, Flagstaff, AZ, 86005, USA
- Marjorie Nguyen
  - Pathogen Genomics Division, Translational Genomics Research Institute, Flagstaff, AZ, 86005, USA
- Brendan Larsen
  - Department of Ecology and Evolutionary Biology, University of Arizona, Tucson, AZ, 85721, USA
- W. Tanner Porter
  - Pathogen Genomics Division, Translational Genomics Research Institute, Flagstaff, AZ, 86005, USA
- Krystal Sheridan
  - Pathogen Genomics Division, Translational Genomics Research Institute, Flagstaff, AZ, 86005, USA
- Jolene R. Bowers
  - Pathogen Genomics Division, Translational Genomics Research Institute, Flagstaff, AZ, 86005, USA
- David M. Engelthaler
  - Pathogen Genomics Division, Translational Genomics Research Institute, Flagstaff, AZ, 86005, USA
|
13
|
Folkerts ML, Lemmer D, Pfeiffer A, Vasquez D, French C, Jones A, Nguyen M, Larsen B, Porter WT, Sheridan K, Bowers JR, Engelthaler DM. Sequencing the pandemic: rapid and high-throughput processing and analysis of COVID-19 clinical samples for 21st century public health. F1000Res 2021; 10:ISCB Comm J-48. [PMID: 35342619 PMCID: PMC8921685 DOI: 10.12688/f1000research.28352.1] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Accepted: 01/20/2021] [Indexed: 11/04/2023] Open
Abstract
Genomic epidemiology has proven successful for real-time and retrospective monitoring of small and large-scale outbreaks. Here, we report two genomic sequencing and analysis strategies for rapid-turnaround or high-throughput processing of metagenomic samples. The rapid-turnaround method was designed to provide a quick phylogenetic snapshot of samples at the heart of active outbreaks, and has a total turnaround time of <48 hours from raw sample to analyzed data. The high-throughput method was designed for semi-retrospective data analysis, and is both cost effective and highly scalable. Though these methods were developed and utilized for the SARS-CoV-2 pandemic response in Arizona, U.S., we envision their use for infectious disease epidemiology in the 21st Century.
Affiliation(s)
- Megan L. Folkerts
  - Pathogen Genomics Division, Translational Genomics Research Institute, Flagstaff, AZ, 86005, USA
- Darrin Lemmer
  - Pathogen Genomics Division, Translational Genomics Research Institute, Flagstaff, AZ, 86005, USA
- Ashlyn Pfeiffer
  - Pathogen Genomics Division, Translational Genomics Research Institute, Flagstaff, AZ, 86005, USA
- Danielle Vasquez
  - Pathogen Genomics Division, Translational Genomics Research Institute, Flagstaff, AZ, 86005, USA
- Chris French
  - Pathogen Genomics Division, Translational Genomics Research Institute, Flagstaff, AZ, 86005, USA
- Amber Jones
  - Pathogen Genomics Division, Translational Genomics Research Institute, Flagstaff, AZ, 86005, USA
- Marjorie Nguyen
  - Pathogen Genomics Division, Translational Genomics Research Institute, Flagstaff, AZ, 86005, USA
- Brendan Larsen
  - Department of Ecology and Evolutionary Biology, University of Arizona, Tucson, AZ, 85721, USA
- W. Tanner Porter
  - Pathogen Genomics Division, Translational Genomics Research Institute, Flagstaff, AZ, 86005, USA
- Krystal Sheridan
  - Pathogen Genomics Division, Translational Genomics Research Institute, Flagstaff, AZ, 86005, USA
- Jolene R. Bowers
  - Pathogen Genomics Division, Translational Genomics Research Institute, Flagstaff, AZ, 86005, USA
- David M. Engelthaler
  - Pathogen Genomics Division, Translational Genomics Research Institute, Flagstaff, AZ, 86005, USA
|
14
|
De Maio N, Walker CR, Turakhia Y, Lanfear R, Corbett-Detig R, Goldman N. Mutation rates and selection on synonymous mutations in SARS-CoV-2. bioRxiv 2021:2021.01.14.426705. [PMID: 33469589 PMCID: PMC7814826 DOI: 10.1101/2021.01.14.426705] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Indexed: 11/25/2022]
Abstract
The COVID-19 pandemic has seen an unprecedented response from the sequencing community. Leveraging the sequence data from more than 140,000 SARS-CoV-2 genomes, we study mutation rates and selective pressures affecting the virus. Understanding the processes and effects of mutation and selection has profound implications for the study of viral evolution, for vaccine design, and for the tracking of viral spread. We highlight and address some common genome sequence analysis pitfalls that can lead to inaccurate inference of mutation rates and selection, such as ignoring skews in the genetic code, not accounting for recurrent mutations, and assuming evolutionary equilibrium. We find that two particular mutation rates, G→U and C→U, are similarly elevated and considerably higher than all other mutation rates, causing the majority of mutations in the SARS-CoV-2 genome, and are possibly the result of APOBEC and ROS activity. These mutations also tend to occur many times at the same genome positions along the global SARS-CoV-2 phylogeny (i.e., they are very homoplasic). We observe an effect of genomic context on mutation rates, but the effect of the context is overall limited. While previous studies have suggested selection acting to decrease U content at synonymous sites, we bring forward evidence suggesting the opposite.
Affiliation(s)
- Nicola De Maio
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridgeshire, CB10 1SD, UK
- Conor R Walker
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridgeshire, CB10 1SD, UK
- Yatish Turakhia
  - Department of Biomolecular Engineering, University of California Santa Cruz, Santa Cruz, CA 95064, USA
- Robert Lanfear
  - Department of Ecology and Evolution, Research School of Biology, Australian National University, Canberra, ACT 2601, Australia
- Russell Corbett-Detig
  - Department of Biomolecular Engineering, University of California Santa Cruz, Santa Cruz, CA 95064, USA
- Nick Goldman
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridgeshire, CB10 1SD, UK
|