1
Li B, Guo J, Hu W, Chen Y. Binding affinity improvement analysis of multiple-mutant Omicron on 2019-nCov to human ACE2 by in silico predictions. J Mol Model 2023; 29:155. [PMID: 37093365 PMCID: PMC10123576 DOI: 10.1007/s00894-023-05536-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/15/2022] [Accepted: 03/01/2023] [Indexed: 04/25/2023]
Abstract
CONTEXT Since the outbreak of COVID-19 in 2019, the 2019-nCov coronavirus has shown diverse mutational characteristics owing to its flexible conformation. One multiple-mutant strain (Omicron) emerged with surprisingly high infectivity and reduced the effectiveness of current drugs and vaccines, making the epidemic significantly harder to prevent and control and seriously threatening health around the world. Exploring the mutational characteristics of the novel coronavirus Omicron can provide strong theoretical guidance for understanding the binding mechanism of mutant viruses. Moreover, a full understanding of the key mutated residues of the Omicron strain can offer new insight into its pathogenic mechanism at the human ACE2 receptor, as well as into subsequent vaccine development. METHODS In this research, 3D structures of 32 single-point mutants of 2019-nCov were first constructed, and the 32-site multiple-mutant Omicron was then obtained from the wild-type virus by homology modeling. A total of 33 2019-nCov/ACE2 complex systems were acquired by protein-protein docking and optimized with preliminary molecular dynamics simulations. Binding free energies between each 2019-nCov mutant system and the human ACE2 receptor were calculated, and the corresponding binding patterns, especially in the regions adjacent to the mutation sites, were analyzed. The results indicated that a total of 6 mutated sites on the Omicron strain played a crucial role in improving the binding capacity of 2019-nCov to the ACE2 protein. Subsequently, we performed long-term molecular dynamics simulations and protein-protein binding energy analysis for the selected 6 mutations. Three of these mutants, T478K, Q493R and G496S, with lower binding energies of -66.36, -67.98 and -67.09 kcal/mol, also showed high infectivity.
These findings indicate that the 3 mutations T478K, Q493R and G496S play crucial roles in enhancing the binding affinity of Omicron to the human ACE2 protein. These results provide important theoretical guidance for future virus detection during the Omicron epidemic, drug research and vaccine development.
Affiliation(s)
- Bo Li
- School of Chemistry and Environmental Engineering, Wuhan Institute of Technology, Wuhan, 430205, China
- Jindan Guo
- NHC Key Laboratory of Human Disease Comparative Medicine, Beijing Engineering Research Center for Experimental Animal Models of Human Critical Diseases, Institute of Laboratory Animal Sciences, Chinese Academy of Medical Sciences & Peking Union Medical College, Beijing, 100021, China
- Wenxiang Hu
- School of Chemistry and Environmental Engineering, Wuhan Institute of Technology, Wuhan, 430205, China
- Yubao Chen
- NHC Key Laboratory of Human Disease Comparative Medicine, Beijing Engineering Research Center for Experimental Animal Models of Human Critical Diseases, Institute of Laboratory Animal Sciences, Chinese Academy of Medical Sciences & Peking Union Medical College, Beijing, 100021, China
2
Khrustalev VV, Khrustaleva TA, Popinako AV. Germline mutations directions are different between introns of the same gene: case study of the gene coding for amyloid-beta precursor protein. Genetica 2023; 151:61-73. [PMID: 36129589 DOI: 10.1007/s10709-022-00166-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/23/2022] [Accepted: 09/08/2022] [Indexed: 02/01/2023]
Abstract
Amyloid-beta precursor protein (APP) is highly conserved in mammals. This feature allowed us to compare nucleotide usage biases in fourfold degenerate sites along the length of its coding region for 146 species of mammals and birds in search of fragments with significant deviations. Even though cytosine usage has the highest value in fourfold degenerate sites in the APP coding region of all tested placental mammals, in contrast to marsupial mammals with their bias toward thymine usage, the most frequent germline and somatic mutations in the human APP coding region are C to T and G to A transitions. The same mutational AT-pressure is characteristic of germline mutations in introns of the human APP gene. However, surprisingly, there are several exceptional introns with deviations in germline mutation rates. Most of those introns surround exons with exceptional biases in nucleotide usage in fourfold degenerate sites. The existence of such fragments in exons 4 and 5, as well as in exon 14, can be connected with the presence of lncRNA genes on the complementary strand of DNA. The exceptional nucleotide usage bias in exons 16 and 17, which contain a sequence encoding amyloid-beta peptides, can be explained either by the presence of as yet unmapped lncRNA(s), or by the autonomous expression of a short mRNA that encodes just the C-terminal part of APP, providing an alternative source of amyloid-beta peptides. This hypothesis is supported by the increased rate of T to C transitions in introns 16-17 and 17-18 of the human APP gene relative to other introns.
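As a rough illustration of the quantity this abstract compares, nucleotide usage at fourfold degenerate sites can be tallied directly from a coding sequence. The Python sketch below is not the authors' pipeline: the function name `fourfold_usage` and the toy sequence are assumptions made for illustration, and only the standard genetic code is considered.

```python
from collections import Counter

# First two bases of the eight codon families that are fourfold degenerate
# in the standard genetic code: Leu (CTN), Val (GTN), Ser (TCN), Pro (CCN),
# Thr (ACN), Ala (GCN), Arg (CGN) and Gly (GGN).
FOURFOLD_PREFIXES = {"CT", "GT", "TC", "CC", "AC", "GC", "CG", "GG"}


def fourfold_usage(cds):
    """Return nucleotide frequencies at fourfold degenerate third positions.

    `cds` is assumed to be an in-frame coding sequence (hypothetical input).
    """
    seq = cds.upper().replace("U", "T")
    counts = Counter()
    # Walk the sequence codon by codon, ignoring a trailing partial codon.
    for i in range(0, len(seq) - len(seq) % 3, 3):
        codon = seq[i:i + 3]
        if codon[:2] in FOURFOLD_PREFIXES and codon[2] in "ACGT":
            counts[codon[2]] += 1
    total = sum(counts.values())
    return {base: (counts[base] / total if total else 0.0) for base in "ACGT"}


if __name__ == "__main__":
    # Toy sequence: GCC, GCT and GGC are fourfold degenerate codons; ATG is not.
    print(fourfold_usage("GCCGCTATGGGC"))
```

Comparing such frequency vectors between exons, or between species, is one simple way to look for the kind of local deviations in nucleotide usage that the abstract describes.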
Affiliation(s)
- Anna Vladimirovna Popinako
- Bach Institute of Biochemistry, Research Center of Biotechnology of the Russian Academy of Sciences, Moscow, Russian Federation
3
Li G, Shi L, Zhang L, Xu B. Componential usage patterns in dengue 4 viruses reveal their better evolutionary adaptation to humans. Front Microbiol 2022; 13:935678. [PMID: 36204606 PMCID: PMC9530264 DOI: 10.3389/fmicb.2022.935678] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/04/2022] [Accepted: 08/22/2022] [Indexed: 11/15/2022]
Abstract
There have been at least four types of dengue outbreaks in the past few years. The evolutionary characteristics of dengue viruses have aroused great concern. The evolutionary characteristics of dengue 4 viruses are examined in the present study based on their base usage patterns and codon usage patterns. The effective number of codons and relative synonymous codon usage (RSCU) values of four types of dengue viruses were calculated. The Kullback–Leibler (K–L) divergences of relative synonymous codon usage from dengue viruses to humans and the Kullback–Leibler divergences of amino acid usage patterns from dengue viruses to humans were calculated to explore the adaptation levels of dengue viruses. The results suggested that: (1) codon adaptation in dengue 4 viruses occurred through an evolutionary process from 1956 to 2021, (2) overall relative synonymous codon usage values of dengue 4 viruses showed more similarities to humans than those of other subtypes of dengue viruses, and (3) the smaller Kullback–Leibler divergence of amino acid usage and relative synonymous codon usage from dengue viruses to humans indicated that the dengue 4 viruses adapted to human hosts better. All results indicated that both mutation pressure and natural selection pressure contributed more obviously to the codon usage pattern of dengue 4 viruses than to those of other subtypes of dengue viruses, and that the dengue 4 viruses adapted to human hosts better than other types of dengue viruses during their evolutionary process.
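The two measures named in this abstract can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: RSCU is taken in its usual form (observed codon count divided by the mean count over the synonymous codons of the same amino acid), the RSCU vectors are renormalised to probability distributions before the Kullback–Leibler divergence is taken, and a small pseudocount (the hypothetical parameter `pseudo`) avoids division by zero. The function names and toy sequences are likewise assumptions.

```python
import math
from collections import Counter, defaultdict
from itertools import product

# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def rscu(cds):
    """Relative synonymous codon usage for every sense codon in `cds`."""
    seq = cds.upper().replace("U", "T")
    counts = Counter(seq[i:i + 3] for i in range(0, len(seq) - len(seq) % 3, 3))
    families = defaultdict(list)
    for codon, aa in CODON_TABLE.items():
        if aa != "*":  # stop codons are excluded
            families[aa].append(codon)
    values = {}
    for codons in families.values():
        total = sum(counts[c] for c in codons)
        for c in codons:
            # Assumption: RSCU is reported as 0.0 when the amino acid is absent.
            values[c] = counts[c] * len(codons) / total if total else 0.0
    return values


def kl_divergence(p, q, pseudo=1e-6):
    """D_KL(P || Q) after adding a pseudocount and renormalising both inputs."""
    keys = sorted(set(p) | set(q))
    pv = [p.get(k, 0.0) + pseudo for k in keys]
    qv = [q.get(k, 0.0) + pseudo for k in keys]
    ps, qs = sum(pv), sum(qv)
    return sum((a / ps) * math.log((a / ps) / (b / qs)) for a, b in zip(pv, qv))


if __name__ == "__main__":
    virus = rscu("GCTGCTGCCGCA")  # toy "viral" coding sequence
    host = rscu("GCCGCCGCTGCG")   # toy "host" coding sequence
    print(kl_divergence(virus, host))
```

A smaller divergence from the viral to the host profile is read in the abstract as closer adaptation; the sketch only shows how such a number could be obtained, not which reference gene set or normalisation the study used.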
Affiliation(s)
- Gun Li
- Laboratory for Biodiversity Science, Department of Biomedical Engineering, School of Electronic Information Engineering, Xi'an Technological University, Xi'an, China
- *Correspondence: Gun Li
- Liang Shi
- Laboratory for Biodiversity Science, Department of Biomedical Engineering, School of Electronic Information Engineering, Xi'an Technological University, Xi'an, China
- Key Laboratory of Analytical Chemistry for Life Science of Shaanxi Province, School of Chemistry and Chemical Engineering, Shaanxi Normal University, Xi'an, China
- Liang Shi
- Liang Zhang
- Laboratory for Biodiversity Science, Department of Biomedical Engineering, School of Electronic Information Engineering, Xi'an Technological University, Xi'an, China
- Bingyi Xu
- Laboratory for Biodiversity Science, Department of Biomedical Engineering, School of Electronic Information Engineering, Xi'an Technological University, Xi'an, China
4
Jiang Y, Zhang H, Yu J, Huang D, Zhai L, Li M, Wang Y, Ren Z, Zou L, Zheng Z, Hu H, Zhang J, Zhang B, Zhao W, Yang X, Li B, Shen C. Humoral immune response to authentic circulating SARS-CoV-2 variants elicited by booster vaccination with distinct RBD subunits in mice. J Med Virol 2022; 94:4533-4538. [PMID: 35614018 PMCID: PMC9347575 DOI: 10.1002/jmv.27882] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 01/31/2022] [Revised: 05/11/2022] [Accepted: 05/23/2022] [Indexed: 11/15/2022]
Abstract
Severe acute respiratory syndrome coronavirus 2 (SARS‐CoV‐2) variants can achieve immune escape through mutations of the spike protein, which threaten to weaken vaccine efficacy. A booster vaccination is expected to increase the humoral immune response against SARS‐CoV‐2 variants in the population. We showed that immunization with two doses of wild type receptor‐binding domain (RBD) protein, followed by booster vaccination with wild type or variant RBD protein, significantly increased binding and neutralizing antibody titers against wild type SARS‐CoV‐2 and its variants in mice. Only booster immunization with the Omicron (BA.1) RBD induced a strong antibody titer against the Omicron virus strain and comparable antibody titers against all the other virus strains. These findings might shed light on coronavirus disease 2019 booster immunogens.
Affiliation(s)
- Yushan Jiang
- BSL-3 Laboratory (Guangdong), Guangdong Provincial Key Laboratory of Tropical Disease Research, School of Public Health, Southern Medical University, Guangzhou, China, 510515
- Huan Zhang
- Provincial Center for Disease Control and Prevention, Guangzhou, China, No.160 Quanxian Road, Dashi Street, Panyu District, 511430
- Jianhai Yu
- BSL-3 Laboratory (Guangdong), Guangdong Provincial Key Laboratory of Tropical Disease Research, School of Public Health, Southern Medical University, Guangzhou, China, 510515
- Dong Huang
- Provincial Center for Disease Control and Prevention, Guangzhou, China, No.160 Quanxian Road, Dashi Street, Panyu District, 511430
- Linlin Zhai
- BSL-3 Laboratory (Guangdong), Guangdong Provincial Key Laboratory of Tropical Disease Research, School of Public Health, Southern Medical University, Guangzhou, China, 510515
- Mengjun Li
- BSL-3 Laboratory (Guangdong), Guangdong Provincial Key Laboratory of Tropical Disease Research, School of Public Health, Southern Medical University, Guangzhou, China, 510515
- Yuelin Wang
- BSL-3 Laboratory (Guangdong), Guangdong Provincial Key Laboratory of Tropical Disease Research, School of Public Health, Southern Medical University, Guangzhou, China, 510515
- Zuning Ren
- BSL-3 Laboratory (Guangdong), Guangdong Provincial Key Laboratory of Tropical Disease Research, School of Public Health, Southern Medical University, Guangzhou, China, 510515
- Lirong Zou
- Provincial Center for Disease Control and Prevention, Guangzhou, China, No.160 Quanxian Road, Dashi Street, Panyu District, 511430
- Zhonghua Zheng
- Provincial Center for Disease Control and Prevention, Guangzhou, China, No.160 Quanxian Road, Dashi Street, Panyu District, 511430
- Huanyu Hu
- BSL-3 Laboratory (Guangdong), Guangdong Provincial Key Laboratory of Tropical Disease Research, School of Public Health, Southern Medical University, Guangzhou, China, 510515
- Bao Zhang
- BSL-3 Laboratory (Guangdong), Guangdong Provincial Key Laboratory of Tropical Disease Research, School of Public Health, Southern Medical University, Guangzhou, China, 510515
- Wei Zhao
- BSL-3 Laboratory (Guangdong), Guangdong Provincial Key Laboratory of Tropical Disease Research, School of Public Health, Southern Medical University, Guangzhou, China, 510515
- Xingfen Yang
- BSL-3 Laboratory (Guangdong), Guangdong Provincial Key Laboratory of Tropical Disease Research, School of Public Health, Southern Medical University, Guangzhou, China, 510515
- Baisheng Li
- Provincial Center for Disease Control and Prevention, Guangzhou, China, No.160 Quanxian Road, Dashi Street, Panyu District, 511430
- Chenguang Shen
- BSL-3 Laboratory (Guangdong), Guangdong Provincial Key Laboratory of Tropical Disease Research, School of Public Health, Southern Medical University, Guangzhou, China, 510515
5
Chazal N. Coronavirus, the King Who Wanted More Than a Crown: From Common to the Highly Pathogenic SARS-CoV-2, Is the Key in the Accessory Genes? Front Microbiol 2021; 12:682603. [PMID: 34335504 PMCID: PMC8317507 DOI: 10.3389/fmicb.2021.682603] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Received: 03/18/2021] [Accepted: 06/22/2021] [Indexed: 12/14/2022]
Abstract
Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), which emerged in late 2019, is the etiologic agent of the current "coronavirus disease 2019" (COVID-19) pandemic, which has serious health implications and a significant global economic impact. Of the seven human coronaviruses, all of which have a zoonotic origin, the pandemic SARS-CoV-2 is the third emerging coronavirus of the 21st century that is highly pathogenic to the human population. Previous human coronavirus outbreaks (SARS-CoV-1 and MERS-CoV) have already provided valuable information on some of the common molecular and cellular mechanisms of coronavirus infections as well as their origin. However, to meet the new challenge posed by SARS-CoV-2, a detailed understanding of its biological specificities, as well as knowledge of its origin, is crucial to provide information on viral pathogenicity, transmission and epidemiology, and to enable strategies for therapeutic interventions and drug discovery. Therefore, in this review, we summarize the current advances in knowledge of SARS-CoV-2 in light of pre-existing information on other recently emerging coronaviruses. We depict the specificity of the immune response of wild bats and discuss current knowledge of the genetic diversity of bat-hosted coronaviruses that promotes viral genome expansion (accessory gene acquisition). In addition, we describe the basic virology of coronaviruses with a special focus on SARS-CoV-2. Finally, we highlight, in detail, the current knowledge of genes and accessory proteins which we postulate to be the major keys to promoting virus adaptation to specific hosts (bat and human), and to contributing to the suppression of immune responses as well as to pathogenicity.
Affiliation(s)
- Nathalie Chazal
- Institut de Recherche en Infectiologie de Montpellier (IRIM), Université de Montpellier, CNRS, Montpellier, France
6
The Long-Term Evolutionary History of Gradual Reduction of CpG Dinucleotides in the SARS-CoV-2 Lineage. BIOLOGY 2021; 10:biology10010052. [PMID: 33445785 PMCID: PMC7828247 DOI: 10.3390/biology10010052] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 11/30/2020] [Revised: 12/29/2020] [Accepted: 01/09/2021] [Indexed: 12/24/2022]
Abstract
Simple Summary Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) caused the coronavirus disease 2019 (COVID-19), a pandemic that infected over 81 million people worldwide. This has led the scientific community to characterize the genome of this virus, including its nucleotide composition. Investigation of the dinucleotide frequency revealed that the proportion of CG dinucleotides (CpG) is highly reduced in the viral genomes. Since CpG dinucleotides are the target site for the host antiviral zinc finger protein, it has been suggested that the reduction in the proportion of CpG is the viral response to escape from the host defense machinery. In the present study, we investigated the time of origin of the reduction in CpG content. Whole genome analyses based on all representative viral genomes of the group Betacoronavirus revealed that the CpG content in the lineage of SARS-CoV-2 has been progressively declining over the past 1213 years. The depletion of CpG was found to occur at neutral as well as selectively constrained positions of the viral genomes.
Abstract Recent studies suggested that the fraction of CG dinucleotides (CpG) is severely reduced in the genome of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2). The CpG deficiency was predicted to be the adaptive response of the virus to evade degradation of the viral RNA by the antiviral zinc finger protein that specifically binds to CpG nucleotides. By comparing all representative genomes belonging to the genus Betacoronavirus, this study examined the potential time of origin of CpG depletion. The results of this investigation revealed a highly significant correlation between the proportions of CpG nucleotides (CpG content) of the betacoronavirus species and their times of divergence from SARS-CoV-2. Species that are distantly related to SARS-CoV-2 had much higher CpG contents than that of SARS-CoV-2. Conversely, closely related species had low CpG contents that are similar to or slightly higher than that of SARS-CoV-2. These results suggest a systematic and continuous reduction in the CpG content in the SARS-CoV-2 lineage that might have started when the Sarbecovirus + Hibecovirus clade separated from Nobecovirus, which was estimated to be 1213 years ago. This depletion was not found to be mediated by the GC contents of the genomes. Our results also showed that the depletion of CpG occurred at neutral positions of the genome as well as those under selection. The latter is evident from the progressive reduction in the proportion of arginine amino acid (coded by CpG dinucleotides) in the SARS-CoV-2 lineage over time. The results of this study suggest that shedding CpG nucleotides from their genome is a continuing process in this viral lineage, potentially to escape from their host defense mechanisms.
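To make the notion of CpG content concrete, the Python sketch below computes two simple statistics for a genome string: the fraction of dinucleotide positions occupied by CG, and the observed/expected ratio that divides this fraction by the product of the C and G base frequencies. The ratio is a commonly used way of separating CpG depletion from overall GC content; whether the study used exactly this normalisation is an assumption, and the function name `cpg_stats` and the toy input are hypothetical.

```python
def cpg_stats(genome):
    """Return (CpG content, observed/expected CpG ratio) for a sequence.

    CpG content is the number of CG dinucleotides divided by the number of
    dinucleotide positions. The expected value assumes C and G occur
    independently at their overall base frequencies.
    """
    seq = genome.upper().replace("U", "T")
    n = len(seq)
    if n < 2:
        raise ValueError("sequence must contain at least two nucleotides")
    # Count CG at every overlapping dinucleotide position.
    cpg = sum(1 for i in range(n - 1) if seq[i:i + 2] == "CG")
    content = cpg / (n - 1)
    expected = (seq.count("C") / n) * (seq.count("G") / n)
    ratio = content / expected if expected else float("nan")
    return content, ratio


if __name__ == "__main__":
    content, ratio = cpg_stats("ACGTCG")  # toy sequence, not a real genome
    print(f"CpG content = {content:.3f}, observed/expected = {ratio:.2f}")
```

Plotting such values for each betacoronavirus genome against its divergence time from SARS-CoV-2 would give the kind of correlation the abstract reports; the divergence times themselves come from the phylogenetic analysis and are not reproduced here.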
7
Potential therapeutic approaches of microRNAs for COVID-19: Challenges and opportunities. J Oral Biol Craniofac Res 2020; 11:132-137. [PMID: 33398242 PMCID: PMC7772998 DOI: 10.1016/j.jobcr.2020.12.006] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Received: 10/16/2020] [Accepted: 12/23/2020] [Indexed: 02/08/2023]
Abstract
Coronavirus disease 2019 (COVID-19) has emerged as the current outbreak caused by the novel severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2). The infection affects the respiratory system and provokes an uncontrolled systemic inflammatory response in the form of a cytokine storm. The main concern about the SARS-CoV-2 pandemic is high viral pathogenicity with no specific drugs. MicroRNAs (miRs) are small non-coding RNAs (21–25 nt) that regulate gene expression. SARS-CoV-2-encoded miRs affect human genes involved in transcription, translation, apoptosis, immune response and inflammation. They also alter the virus's own gene regulation and hijack host miRs, providing a protective environment that maintains its latency. On the other hand, host miRs play a critical role in viral gene expression to restrict infection. Overexpression or inhibition of miRs might result in cell cycle irregularity, an impaired immune response or cancer. The exact role of each miR should therefore be specified. Mimics of encoded miRs, such as antagomirs, have shown successful results in clinical trial phases in preventing the negative effects of virus-encoded miRs. Mimic miR products are inexpensive, comparable in cost to primer synthesis; they are short and nanoscale in size. Although the SARS-CoV-2 genome is still being evaluated, identifying the exact molecular pathogenesis opens up opportunities for vaccine development. Salivaomics can evaluate the SARS-CoV-2 genome, transcriptome, proteome and biomarkers such as miRs in oral-related diseases and cancer. In this review, we examine the challenges and opportunities of miRs as a therapeutic approach for SARS-CoV-2 infection, and then survey the role of miRs in saliva droplets during SARS-CoV-2 infection and related cancer.
8
Wei Y, Silke JR, Aris P, Xia X. Coronavirus genomes carry the signatures of their habitats. PLoS One 2020; 15:e0244025. [PMID: 33351847 PMCID: PMC7755226 DOI: 10.1371/journal.pone.0244025] [Citation(s) in RCA: 21] [Impact Index Per Article: 5.3] [Received: 08/19/2020] [Accepted: 12/01/2020] [Indexed: 12/15/2022]
Abstract
Coronaviruses such as SARS-CoV-2 regularly infect host tissues that express antiviral proteins (AVPs) in abundance. Understanding how they evolve to adapt or evade host immune responses is important in the effort to control the spread of infection. Two AVPs that may shape viral genomes are the zinc finger antiviral protein (ZAP) and the apolipoprotein B mRNA editing enzyme-catalytic polypeptide-like 3 (APOBEC3). The former binds to CpG dinucleotides to facilitate the degradation of viral transcripts while the latter frequently deaminates C into U residues which could generate notable viral sequence variations. We tested the hypothesis that both APOBEC3 and ZAP impose selective pressures that shape the genome of an infecting coronavirus. Our investigation considered a comprehensive number of publicly available genomes for seven coronaviruses (SARS-CoV-2, SARS-CoV, and MERS infecting Homo sapiens, Bovine CoV infecting Bos taurus, MHV infecting Mus musculus, HEV infecting Sus scrofa, and CRCoV infecting Canis lupus familiaris). We show that coronaviruses that regularly infect tissues with abundant AVPs have CpG-deficient and U-rich genomes; whereas those that do not infect tissues with abundant AVPs do not share these sequence hallmarks. Among the coronaviruses surveyed herein, CpG is most deficient in SARS-CoV-2 and a temporal analysis showed a marked increase in C to U mutations over four months of SARS-CoV-2 genome evolution. Furthermore, the preferred motifs in which these C to U mutations occur are the same as those subjected to APOBEC3 editing in HIV-1. These results suggest that both ZAP and APOBEC3 shape the SARS-CoV-2 genome: ZAP imposes a strong CpG avoidance, and APOBEC3 constantly edits C to U. Evolutionary pressures exerted by host immune systems onto viral genomes may motivate novel strategies for SARS-CoV-2 vaccine development.
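The temporal analysis of C to U mutations mentioned in this abstract rests on counting directed substitutions between a reference genome and later samples. The Python sketch below shows one simple way to tabulate them from a pair of pre-aligned, equal-length sequences. It is an illustration only: the function name `substitution_counts` and the toy sequences are assumptions, alignment is taken as given, and the study's own mutation calling and motif analysis are not reproduced.

```python
from collections import Counter


def substitution_counts(reference, sample):
    """Count directed substitutions (ref_base, sample_base) between two
    aligned RNA/DNA sequences of equal length.

    Gaps and ambiguous bases are skipped. T is treated as U so that DNA
    and RNA notation give the same result.
    """
    if len(reference) != len(sample):
        raise ValueError("sequences must be aligned to the same length")
    ref = reference.upper().replace("T", "U")
    alt = sample.upper().replace("T", "U")
    counts = Counter()
    for r, s in zip(ref, alt):
        if r != s and r in "ACGU" and s in "ACGU":
            counts[(r, s)] += 1
    return counts


if __name__ == "__main__":
    # Toy aligned pair: one C->U change and one U->C change.
    counts = substitution_counts("ACCGU", "AUCGC")
    print(counts[("C", "U")], counts[("U", "C")])
```

Tabulating `counts[("C", "U")]` for samples grouped by collection date would give a crude view of how the C to U share changes over time.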
Affiliation(s)
- Yulong Wei
- Department of Biology, University of Ottawa, Ottawa, Ontario, Canada
- Jordan R. Silke
- Department of Biology, University of Ottawa, Ottawa, Ontario, Canada
- Parisa Aris
- Department of Biology, University of Ottawa, Ottawa, Ontario, Canada
- Xuhua Xia
- Department of Biology, University of Ottawa, Ottawa, Ontario, Canada
- Ottawa Institute of Systems Biology, University of Ottawa, Ottawa, Ontario, Canada
9
Turakhia Y, De Maio N, Thornlow B, Gozashti L, Lanfear R, Walker CR, Hinrichs AS, Fernandes JD, Borges R, Slodkowicz G, Weilguny L, Haussler D, Goldman N, Corbett-Detig R. Stability of SARS-CoV-2 phylogenies. PLoS Genet 2020; 16:e1009175. [PMID: 33206635 PMCID: PMC7721162 DOI: 10.1371/journal.pgen.1009175] [Citation(s) in RCA: 57] [Impact Index Per Article: 14.3] [Received: 06/11/2020] [Revised: 12/07/2020] [Accepted: 10/06/2020] [Indexed: 12/23/2022]
Abstract
The SARS-CoV-2 pandemic has led to unprecedented, nearly real-time genetic tracing due to the rapid community sequencing response. Researchers immediately leveraged these data to infer the evolutionary relationships among viral samples and to study key biological questions, including whether host viral genome editing and recombination are features of SARS-CoV-2 evolution. This global sequencing effort is inherently decentralized and must rely on data collected by many labs using a wide variety of molecular and bioinformatic techniques. There is thus a strong possibility that systematic errors associated with lab- or protocol-specific practices affect some sequences in the repositories. We find that some recurrent mutations in reported SARS-CoV-2 genome sequences have been observed predominantly or exclusively by single labs, co-localize with commonly used primer binding sites and are more likely to affect the protein-coding sequences than other similarly recurrent mutations. We show that their inclusion can affect phylogenetic inference on scales relevant to local lineage tracing, and make it appear as though there has been an excess of recurrent mutation or recombination among viral lineages. We suggest how samples can be screened and problematic variants removed, and we plan to regularly inform the scientific community with our updated results as more SARS-CoV-2 genome sequences are shared (https://virological.org/t/issues-with-sars-cov-2-sequencing-data/473 and https://virological.org/t/masking-strategies-for-sars-cov-2-alignments/480). We also develop tools for comparing and visualizing differences among very large phylogenies and we show that consistent clade- and tree-based comparisons can be made between phylogenies produced by different groups. These will facilitate evolutionary inferences and comparisons among phylogenies produced for a wide array of purposes. Building on the SARS-CoV-2 Genome Browser at UCSC, we present a toolkit to compare, analyze and combine SARS-CoV-2 phylogenies, find and remove potential sequencing errors and establish a widely shared, stable clade structure for more accurate scientific inference and discourse.
Affiliation(s)
- Yatish Turakhia
- Department of Biomolecular Engineering, University of California Santa Cruz, Santa Cruz, CA, United States of America
- Genomics Institute, University of California Santa Cruz, Santa Cruz, CA, United States of America
- Nicola De Maio
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Cambridge, United Kingdom
- Bryan Thornlow
- Department of Biomolecular Engineering, University of California Santa Cruz, Santa Cruz, CA, United States of America
- Genomics Institute, University of California Santa Cruz, Santa Cruz, CA, United States of America
- Landen Gozashti
- Department of Biomolecular Engineering, University of California Santa Cruz, Santa Cruz, CA, United States of America
- Genomics Institute, University of California Santa Cruz, Santa Cruz, CA, United States of America
- Department of Organismic and Evolutionary Biology, Harvard University, Cambridge, MA, United States of America
- Robert Lanfear
- Department of Ecology and Evolution, Research School of Biology, Australian National University, Canberra, ACT, Australia
- Conor R. Walker
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Cambridge, United Kingdom
- Department of Genetics, University of Cambridge, Cambridge, United Kingdom
- Angie S. Hinrichs
- Genomics Institute, University of California Santa Cruz, Santa Cruz, CA, United States of America
- Jason D. Fernandes
- Department of Biomolecular Engineering, University of California Santa Cruz, Santa Cruz, CA, United States of America
- Genomics Institute, University of California Santa Cruz, Santa Cruz, CA, United States of America
- Howard Hughes Medical Institute, University of California, Santa Cruz, CA, United States of America
- Rui Borges
- Institut für Populationsgenetik, Vetmeduni Vienna, Wien, Austria
- Greg Slodkowicz
- MRC Laboratory of Molecular Biology, Cambridge, United Kingdom
- Lukas Weilguny
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Cambridge, United Kingdom
- David Haussler
- Department of Biomolecular Engineering, University of California Santa Cruz, Santa Cruz, CA, United States of America
- Genomics Institute, University of California Santa Cruz, Santa Cruz, CA, United States of America
- Howard Hughes Medical Institute, University of California, Santa Cruz, CA, United States of America
- Nick Goldman
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Cambridge, United Kingdom
- Russell Corbett-Detig
- Department of Biomolecular Engineering, University of California Santa Cruz, Santa Cruz, CA, United States of America
- Genomics Institute, University of California Santa Cruz, Santa Cruz, CA, United States of America