1
Wang D, Huot M, Mohanty V, Shakhnovich EI. Biophysical principles predict fitness of SARS-CoV-2 variants. Proc Natl Acad Sci U S A 2024; 121:e2314518121. [PMID: 38820002 PMCID: PMC11161772 DOI: 10.1073/pnas.2314518121] [Received: 08/22/2023] [Accepted: 04/19/2024] [Indexed: 06/02/2024]
Abstract
SARS-CoV-2 employs its spike protein's receptor binding domain (RBD) to enter host cells. The RBD is constantly subjected to immune responses, while requiring efficient binding to host cell receptors for successful infection. However, our understanding of how RBD's biophysical properties contribute to SARS-CoV-2's epidemiological fitness remains largely incomplete. Through a comprehensive approach, comprising large-scale sequence analysis of SARS-CoV-2 variants and the identification of a fitness function based on binding thermodynamics, we unravel the relationship between the biophysical properties of RBD variants and their contribution to viral fitness. We developed a biophysical model that uses statistical mechanics to map the molecular phenotype space, characterized by dissociation constants of RBD to ACE2, LY-CoV016, LY-CoV555, REGN10987, and S309, onto an epistatic fitness landscape. We validate our findings through experimentally measured and machine learning (ML) estimated binding affinities, coupled with infectivity data derived from population-level sequencing. Our analysis reveals that this model effectively predicts the fitness of novel RBD variants and can account for the epistatic interactions among mutations, including explaining the later reversal of Q493R. Our study sheds light on the impact of specific mutations on viral fitness and delivers a tool for predicting the future epidemiological trajectory of previously unseen or emerging low-frequency variants. These insights offer not only greater understanding of viral evolution but also potentially aid in guiding public health decisions in the battle against COVID-19 and future pandemics.
Affiliation(s)
- Dianzhuo Wang
  - Department of Chemistry and Chemical Biology, Harvard University, Cambridge, MA 02138
  - John A. Paulson School of Engineering and Applied Sciences, Harvard University, Cambridge, MA 02138
- Marian Huot
  - Department of Chemistry and Chemical Biology, Harvard University, Cambridge, MA 02138
  - École Polytechnique, Institut Polytechnique de Paris, Palaiseau 91128, France
- Vaibhav Mohanty
  - Department of Chemistry and Chemical Biology, Harvard University, Cambridge, MA 02138
  - Harvard/MIT MD-PhD Program, Harvard Medical School, Boston, MA 02115
  - Massachusetts Institute of Technology, Cambridge, MA 02139
- Eugene I. Shakhnovich
  - Department of Chemistry and Chemical Biology, Harvard University, Cambridge, MA 02138
2
Faraji N, Zeinali T, Joukar F, Aleali MS, Eslami N, Shenagari M, Mansour-Ghanaei F. Mutational dynamics of SARS-CoV-2: Impact on future COVID-19 vaccine strategies. Heliyon 2024; 10:e30208. [PMID: 38707429 PMCID: PMC11066641 DOI: 10.1016/j.heliyon.2024.e30208] [Received: 07/08/2023] [Revised: 04/18/2024] [Accepted: 04/22/2024] [Indexed: 05/07/2024]
Abstract
The rapid emergence of multiple strains of Severe Acute Respiratory Syndrome Coronavirus-2 (SARS-CoV-2) has sparked profound concerns regarding the ongoing evolution of the virus and its potential impact on global health. Classified by the World Health Organization (WHO) as variants of concern (VOC), these strains exhibit heightened transmissibility and pathogenicity, posing significant challenges to existing vaccine strategies. Despite widespread vaccination efforts, the continual evolution of SARS-CoV-2 variants presents a formidable obstacle to achieving herd immunity. Of particular concern is the coronavirus spike (S) protein, a pivotal viral surface protein crucial for host cell entry and infectivity. Mutations within the S protein have been shown to enhance transmissibility and confer resistance to antibody-mediated neutralization, undermining the efficacy of traditional vaccine platforms. Moreover, the S protein undergoes rapid molecular evolution under selective immune pressure, leading to the emergence of diverse variants with distinct mutation profiles. This review underscores the urgent need for vigilance and adaptation in vaccine development efforts to combat the evolving landscape of SARS-CoV-2 mutations and ensure the long-term effectiveness of global immunization campaigns.
Affiliation(s)
- Niloofar Faraji
  - Gastrointestinal and Liver Diseases Research Center, Guilan University of Medical Sciences, Rasht, Iran
- Tahereh Zeinali
  - Gastrointestinal and Liver Diseases Research Center, Guilan University of Medical Sciences, Rasht, Iran
- Farahnaz Joukar
  - Gastrointestinal and Liver Diseases Research Center, Guilan University of Medical Sciences, Rasht, Iran
- Maryam Sadat Aleali
  - Gastrointestinal and Liver Diseases Research Center, Guilan University of Medical Sciences, Rasht, Iran
- Narges Eslami
  - Gastrointestinal and Liver Diseases Research Center, Guilan University of Medical Sciences, Rasht, Iran
- Mohammad Shenagari
  - Gastrointestinal and Liver Diseases Research Center, Guilan University of Medical Sciences, Rasht, Iran
  - Department of Microbiology, School of Medicine, Guilan University of Medical Sciences, Rasht, Iran
- Fariborz Mansour-Ghanaei
  - Gastrointestinal and Liver Diseases Research Center, Guilan University of Medical Sciences, Rasht, Iran
3
Zhang Y, Sun H, Zhang W, Fu T, Huang S, Mou M, Zhang J, Gao J, Ge Y, Yang Q, Zhu F. CellSTAR: a comprehensive resource for single-cell transcriptomic annotation. Nucleic Acids Res 2024; 52:D859-D870. [PMID: 37855686 PMCID: PMC10767908 DOI: 10.1093/nar/gkad874] [Received: 08/15/2023] [Revised: 09/12/2023] [Accepted: 09/27/2023] [Indexed: 10/20/2023]
Abstract
Large-scale studies of single-cell sequencing and biological experiments have successfully revealed expression patterns that distinguish different cell types in tissues, emphasizing the importance of studying cellular heterogeneity and accurately annotating cell types. Analysis of gene expression profiles in these experiments provides two essential types of data for cell type annotation: annotated references and canonical markers. In this study, the first comprehensive database of single-cell transcriptomic annotation resource (CellSTAR) was thus developed. It is unique in (a) offering the comprehensive expertly annotated reference data for annotating hundreds of cell types for the first time and (b) enabling the collective consideration of reference data and marker genes by incorporating tens of thousands of markers. Given its unique features, CellSTAR is expected to attract broad research interests from the technological innovations in single-cell transcriptomics, the studies of cellular heterogeneity & dynamics, and so on. It is now publicly accessible without any login requirement at: https://idrblab.org/cellstar.
Affiliation(s)
- Ying Zhang
  - College of Pharmaceutical Sciences, The Second Affiliated Hospital, Zhejiang University School of Medicine, Zhejiang University, Hangzhou 310058, China
- Huaicheng Sun
  - College of Pharmaceutical Sciences, The Second Affiliated Hospital, Zhejiang University School of Medicine, Zhejiang University, Hangzhou 310058, China
- Wei Zhang
  - College of Pharmaceutical Sciences, The Second Affiliated Hospital, Zhejiang University School of Medicine, Zhejiang University, Hangzhou 310058, China
- Tingting Fu
  - College of Pharmaceutical Sciences, The Second Affiliated Hospital, Zhejiang University School of Medicine, Zhejiang University, Hangzhou 310058, China
- Shijie Huang
  - College of Pharmaceutical Sciences, The Second Affiliated Hospital, Zhejiang University School of Medicine, Zhejiang University, Hangzhou 310058, China
- Minjie Mou
  - College of Pharmaceutical Sciences, The Second Affiliated Hospital, Zhejiang University School of Medicine, Zhejiang University, Hangzhou 310058, China
- Jinsong Zhang
  - College of Pharmaceutical Sciences, The Second Affiliated Hospital, Zhejiang University School of Medicine, Zhejiang University, Hangzhou 310058, China
- Jianqing Gao
  - College of Pharmaceutical Sciences, The Second Affiliated Hospital, Zhejiang University School of Medicine, Zhejiang University, Hangzhou 310058, China
- Yichao Ge
  - College of Pharmaceutical Sciences, The Second Affiliated Hospital, Zhejiang University School of Medicine, Zhejiang University, Hangzhou 310058, China
  - Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, Alibaba-Zhejiang University Joint Research Center of Future Digital Healthcare, Hangzhou 330110, China
- Qingxia Yang
  - Zhejiang Provincial Key Laboratory of Precision Diagnosis and Therapy for Major Gynecological Diseases, Women's Hospital, Zhejiang University School of Medicine, Hangzhou 310058, China
  - Department of Bioinformatics, School of Geographic and Biologic Information, Nanjing University of Posts and Telecommunications, Nanjing 210023, China
- Feng Zhu
  - College of Pharmaceutical Sciences, The Second Affiliated Hospital, Zhejiang University School of Medicine, Zhejiang University, Hangzhou 310058, China
  - Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, Alibaba-Zhejiang University Joint Research Center of Future Digital Healthcare, Hangzhou 330110, China
4
Ramachandran A, Lumetta SS, Chen D. PandoGen: Generating complete instances of future SARS-CoV-2 sequences using Deep Learning. PLoS Comput Biol 2024; 20:e1011790. [PMID: 38241392 PMCID: PMC10829978 DOI: 10.1371/journal.pcbi.1011790] [Received: 08/22/2023] [Revised: 01/31/2024] [Accepted: 12/27/2023] [Indexed: 01/21/2024]
Abstract
One of the challenges in a viral pandemic is the emergence of novel variants with different phenotypical characteristics. An ability to forecast future viral individuals at the sequence level enables advance preparation by characterizing the sequences and closing vulnerabilities in current preventative and therapeutic methods. In this article, we explore, in the context of a viral pandemic, the problem of generating complete instances of undiscovered viral protein sequences, which have a high likelihood of being discovered in the future using protein language models. Current approaches to training these models fit model parameters to a known sequence set, which does not suit pandemic forecasting as future sequences differ from known sequences in some respects. To address this, we develop a novel method, called PandoGen, to train protein language models towards the pandemic protein forecasting task. PandoGen combines techniques such as synthetic data generation, conditional sequence generation, and reward-based learning, enabling the model to forecast future sequences, with a high propensity to spread. Applying our method to modeling the SARS-CoV-2 Spike protein sequence, we find empirically that our model forecasts twice as many novel sequences with five times the case counts compared to a model that is 30× larger. Our method forecasts unseen lineages months in advance, whereas models 4× and 30× larger forecast almost no new lineages. When trained on data available up to a month before the onset of important Variants of Concern, our method consistently forecasts sequences belonging to those variants within tight sequence budgets.
Affiliation(s)
- Anand Ramachandran
  - University of Illinois at Urbana-Champaign, Urbana, Illinois, United States of America
- Steven S. Lumetta
  - University of Illinois at Urbana-Champaign, Urbana, Illinois, United States of America
- Deming Chen
  - University of Illinois at Urbana-Champaign, Urbana, Illinois, United States of America