1
Levi R, Zerhouni EG, Altuvia S. Predicting the spread of SARS-CoV-2 variants: An artificial intelligence enabled early detection. PNAS NEXUS 2024; 3:pgad424. [PMID: 38170049 PMCID: PMC10759796 DOI: 10.1093/pnasnexus/pgad424] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/16/2023] [Accepted: 11/27/2023] [Indexed: 01/05/2024]
Abstract
During more than 3 years since its emergence, SARS-CoV-2 has shown great ability to mutate rapidly into diverse variants, some of which turned out to be very infectious and have spread throughout the world causing waves of infections. At this point, many countries have already experienced up to six waves of infections. Extensive academic work has focused on the development of models to predict the pandemic trajectory based on epidemiological data, but none has focused on predicting variant-specific spread. Moreover, important scientific literature analyzes the genetic evolution of SARS-CoV-2 variants and how it might functionally affect their infectivity. However, genetic attributes have not yet been incorporated into existing epidemiological modeling that aims to capture infection trajectory. Thus, this study leverages variant-specific genetic characteristics together with epidemiological information to systematically predict the future spread trajectory of newly detected variants. The study describes the analysis of 9.0 million SARS-CoV-2 genetic sequences in 30 countries and identifies temporal characteristic patterns of SARS-CoV-2 variants that caused significant infection waves. Using this descriptive analysis, a machine-learning-enabled risk assessment model has been developed to predict, as early as 1 week after their first detection, which variants are likely to constitute the new wave of infections in the following 3 months. The model's out-of-sample area under the curve (AUC) is 86.3% for predictions after 1 week and 90.8% for predictions after 2 weeks. The methodology described in this paper could contribute more broadly to the development of improved predictive models for variants of other infectious viruses.
Affiliation(s)
- Retsef Levi
  - Sloan School of Management, Massachusetts Institute of Technology, Cambridge, MA 02139, USA
- El Ghali Zerhouni
  - Operations Research Center, Massachusetts Institute of Technology, Cambridge, MA 02139, USA
- Shoshy Altuvia
  - Department of Microbiology and Molecular Genetics, The Hebrew University-Hadassah Medical School, Jerusalem, 9112102, Israel
2
Sinha A, Roy S. Intrinsically Disordered Regions Function as a Cervical Collar to Remotely Regulate the Nodding Dynamics of SARS-CoV-2 Prefusion Spike Heads. J Phys Chem B 2023; 127:8393-8405. [PMID: 37738458 DOI: 10.1021/acs.jpcb.3c05338] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Indexed: 09/24/2023]
Abstract
The SARS-CoV-2 prefusion spike heads (receptor binding domains, RBDs) frequently nod down and up to interact with host cell receptors. As the spike protein is a trimeric unit of significant size, to understand its large-scale structural dynamics associated with the nodding mechanism and the mutational impact on the same, we develop a topological symmetry-information-loaded coarse-grained structure-based model of a spike trimer using recent cryo-EM structural data. Our study reveals the control of two distant intrinsically disordered regions (IDRs), namely, 630 and FPPR loops, over the nodding dynamics of spike heads. We find that the order-disorder transition of IDRs becomes more evident in the variants of concern (VOCs) that are associated with the characteristic mutation, D614G, in the proximity of these IDRs. In some VOCs, the two other mutations A570D and S982A also show an integral effect. The driver mutation D614G instigates a salt-bridge disruption, altering the order-disorder dynamics of both 630 and FPPR loops and their interaction with the C-terminal domains (CTD1/CTD2). This altered connectivity in these mutants allows the two IDRs to act collectively as a "cervical collar" for the RBD, supporting various spike head postures, consistent with cryo-EM results available for specific cases. The IDRs' control over the spike structure and dynamics presents an exciting opportunity where they can be targeted as remote operational switches to artificially maneuver the nod for effective therapeutic interventions.
Affiliation(s)
- Anushree Sinha
  - Department of Chemical Sciences, Indian Institute of Science Education and Research Kolkata, Mohanpur 741246, West Bengal, India
- Susmita Roy
  - Department of Chemical Sciences, Indian Institute of Science Education and Research Kolkata, Mohanpur 741246, West Bengal, India
3
Sinha A, Sangeet S, Roy S. Evolution of Sequence and Structure of SARS-CoV-2 Spike Protein: A Dynamic Perspective. ACS OMEGA 2023; 8:23283-23304. [PMID: 37426203 PMCID: PMC10324094 DOI: 10.1021/acsomega.3c00944] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/12/2023] [Accepted: 06/01/2023] [Indexed: 07/11/2023]
Abstract
Novel coronavirus (SARS-CoV-2) enters its host cell through a surface spike protein. The viral spike protein has undergone several modifications/mutations at the genomic level, through which it modulated its structure-function and passed through several variants of concern. Recent advances in high-resolution structure determination and multiscale imaging techniques, cost-effective next-generation sequencing, and development of new computational methods (including information theory, statistical methods, machine learning, and many other artificial intelligence-based techniques) have hugely contributed to the characterization of sequence, structure, function of spike proteins, and its different variants to understand viral pathogenesis, evolutions, and transmission. Laying on the foundation of the sequence-structure-function paradigm, this review summarizes not only the important findings on structure/function but also the structural dynamics of different spike components, highlighting the effects of mutations on them. As dynamic fluctuations of three-dimensional spike structure often provide important clues for functional modulation, quantifying time-dependent fluctuations of mutational events over spike structure and its genetic/amino acidic sequence helps identify alarming functional transitions having implications for enhanced fusogenicity and pathogenicity of the virus. Although these dynamic events are more difficult to capture than quantifying a static, average property, this review encompasses those challenging aspects of characterizing the evolutionary dynamics of spike sequence and structure and their implications for functions.
4
Gigante G, Giuliani A. Reconstruction of the Temporal Correlation Network of All-Cause Mortality Fluctuation across Italian Regions: The Importance of Temperature and Among-Nodes Flux. ENTROPY (BASEL, SWITZERLAND) 2022; 25:21. [PMID: 36673162 PMCID: PMC9858294 DOI: 10.3390/e25010021] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/21/2022] [Revised: 12/19/2022] [Accepted: 12/19/2022] [Indexed: 06/17/2023]
Abstract
All-cause mortality is a very coarse grain, albeit very reliable, index to check the health implications of lifestyle determinants, systemic threats and socio-demographic factors. In this work, we adopt a statistical-mechanics approach to the analysis of temporal fluctuations of all-cause mortality, focusing on the correlation structure of this index across different regions of Italy. The correlation network among the 20 Italian regions was reconstructed using temperature oscillations and traveller flux (as a function of distance and region's attractiveness, based on GDP), allowing for a separation between infective and non-infective death causes. The proposed approach allows monitoring of emerging systemic threats in terms of anomalies of correlation network structure.
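As a rough illustration of what a correlation network over regional time series looks like, the sketch below links two regions whenever their series are strongly correlated. It is not the authors' reconstruction method, which relies on temperature oscillations and traveller flux. The use of Pearson correlation, the threshold of 0.9, the region names, and the numbers are all assumptions made up for the example.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length series."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def correlation_network(series, threshold):
    """Map each pair of nodes with |r| >= threshold to its correlation r."""
    edges = {}
    for a, b in combinations(sorted(series), 2):
        r = pearson(series[a], series[b])
        if abs(r) >= threshold:
            edges[(a, b)] = r
    return edges


# Hypothetical weekly mortality fluctuations for three made-up regions.
series = {
    "region_A": [1, 2, 3, 4, 5],
    "region_B": [2, 4, 6, 8, 10],
    "region_C": [5, 3, 4, 1, 2],
}

# Assumed cut-off: keep only pairs with |r| >= 0.9.
network = correlation_network(series, threshold=0.9)
print(network)
```

With this toy data only the pair (region_A, region_B) is kept. An anomaly in the sense of the abstract would show up as edges appearing or disappearing when the same calculation is repeated over a later time window.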
Affiliation(s)
- Guido Gigante
  - Radiation Protection and Computational Physics, Istituto Superiore di Sanità, 00161 Rome, Italy
- Alessandro Giuliani
  - Environment and Health Department, Istituto Superiore di Sanità, 00161 Rome, Italy
5
Saldivar-Espinoza B, Macip G, Garcia-Segura P, Mestres-Truyol J, Puigbò P, Cereto-Massagué A, Pujadas G, Garcia-Vallve S. Prediction of Recurrent Mutations in SARS-CoV-2 Using Artificial Neural Networks. Int J Mol Sci 2022; 23:ijms232314683. [PMID: 36499005 PMCID: PMC9736107 DOI: 10.3390/ijms232314683] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/27/2022] [Revised: 11/18/2022] [Accepted: 11/22/2022] [Indexed: 11/26/2022]
Abstract
Predicting SARS-CoV-2 mutations is difficult, but predicting recurrent mutations driven by the host, such as those caused by host deaminases, is feasible. We used machine learning to predict which positions from the SARS-CoV-2 genome will hold a recurrent mutation and which mutations will be the most recurrent. We used data from April 2021 that we separated into three sets: a training set, a validation set, and an independent test set. For the test set, we obtained a specificity value of 0.69, a sensitivity value of 0.79, and an Area Under the Curve (AUC) of 0.8, showing that the prediction of recurrent SARS-CoV-2 mutations is feasible. Subsequently, we compared our predictions with updated data from January 2022, showing that some of the false positives in our prediction model become true positives later on. The most important variables detected by the model's Shapley Additive exPlanation (SHAP) are the nucleotide that mutates and RNA reactivity. This is consistent with the SARS-CoV-2 mutational bias pattern and the preference of some host deaminases for specific sequences and RNA secondary structures. We extend our investigation by analyzing the mutations from the variants of concern Alpha, Beta, Delta, Gamma, and Omicron. Finally, we analyzed amino acid changes by looking at the predicted recurrent mutations in the M-pro and spike proteins.
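For readers unfamiliar with the three metrics quoted in this abstract, the sketch below shows how sensitivity, specificity, and AUC are computed for a binary classifier. It is a minimal illustration, not the authors' evaluation code. The labels, scores, and the 0.5 decision threshold are made up, and the function names are hypothetical. AUC is computed here through its rank interpretation: the fraction of positive/negative pairs in which the positive example receives the higher score, with ties counted as one half.

```python
def confusion_counts(y_true, y_pred):
    """Return (tp, tn, fp, fn) for binary labels coded as 0/1."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, tn, fp, fn


def sensitivity_specificity(y_true, y_pred):
    """Sensitivity = TP / (TP + FN); specificity = TN / (TN + FP)."""
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)
    return tp / (tp + fn), tn / (tn + fp)


def roc_auc(y_true, scores):
    """AUC as the share of (positive, negative) pairs ranked correctly."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Made-up data: 1 = position holds a recurrent mutation, 0 = it does not.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.35, 0.7, 0.3, 0.2, 0.1]

# Assumed decision threshold of 0.5 to turn scores into class predictions.
y_pred = [1 if s >= 0.5 else 0 for s in scores]

sens, spec = sensitivity_specificity(y_true, y_pred)
auc = roc_auc(y_true, scores)
print(sens, spec, auc)
```

Sensitivity and specificity depend on the chosen threshold, whereas AUC summarizes ranking quality across all thresholds, which is why a study can report all three side by side.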
Affiliation(s)
- Bryan Saldivar-Espinoza
  - Research Group in Cheminformatics & Nutrition, Departament de Bioquímica i Biotecnologia, Campus de Sescelades, Universitat Rovira i Virgili, 43007 Tarragona, Spain
- Guillem Macip
  - Research Group in Cheminformatics & Nutrition, Departament de Bioquímica i Biotecnologia, Campus de Sescelades, Universitat Rovira i Virgili, 43007 Tarragona, Spain
- Pol Garcia-Segura
  - Research Group in Cheminformatics & Nutrition, Departament de Bioquímica i Biotecnologia, Campus de Sescelades, Universitat Rovira i Virgili, 43007 Tarragona, Spain
- Júlia Mestres-Truyol
  - Research Group in Cheminformatics & Nutrition, Departament de Bioquímica i Biotecnologia, Campus de Sescelades, Universitat Rovira i Virgili, 43007 Tarragona, Spain
- Pere Puigbò
  - Department of Biology, University of Turku, 20500 Turku, Finland
  - Department of Biochemistry and Biotechnology, Rovira i Virgili University, 43007 Tarragona, Spain
  - Nutrition and Health Unit, Eurecat Technology Centre of Catalonia, 43204 Reus, Spain
- Adrià Cereto-Massagué
  - EURECAT Centre Tecnològic de Catalunya, Centre for Omic Sciences (COS), Joint Unit Universitat Rovira i Virgili-EURECAT, Unique Scientific and Technical Infrastructures (ICTS), 43204 Reus, Spain
- Gerard Pujadas
  - Research Group in Cheminformatics & Nutrition, Departament de Bioquímica i Biotecnologia, Campus de Sescelades, Universitat Rovira i Virgili, 43007 Tarragona, Spain
- Santiago Garcia-Vallve
  - Research Group in Cheminformatics & Nutrition, Departament de Bioquímica i Biotecnologia, Campus de Sescelades, Universitat Rovira i Virgili, 43007 Tarragona, Spain
  - Correspondence: