1
Matson RP, Comba IY, Silvert E, Niesen MJM, Murugadoss K, Patwardhan D, Suratekar R, Goel EG, Poelaert BJ, Wan KK, Brimacombe KR, Venkatakrishnan AJ, Soundararajan V. A deep learning approach predicting the activity of COVID-19 therapeutics and vaccines against emerging variants. NPJ Syst Biol Appl 2024; 10:138. [PMID: 39604453 PMCID: PMC11603192 DOI: 10.1038/s41540-024-00471-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/07/2023] [Accepted: 11/09/2024] [Indexed: 11/29/2024]
Abstract
Understanding which viral variants evade neutralization is crucial for improving antibody-based treatments, especially with rapidly evolving viruses like SARS-CoV-2. Yet, conventional assays are labor intensive and cannot capture the full spectrum of variants. We present a deep learning approach to predict changes in neutralizing antibody activity of COVID-19 therapeutics and vaccine-elicited sera/plasma against emerging viral variants. Our approach leverages data from 67,885 unique SARS-CoV-2 Spike sequences and 7,069 in vitro assays. The resulting model accurately predicted fold changes in neutralizing activity (R² = 0.77) for a test set (N = 980) of data collected up to eight months after the training data. Next, the model was used to predict changes in activity of current therapeutic and vaccine-induced antibodies against emerging SARS-CoV-2 lineages. Consistent with other work, we found significantly reduced activity against newer XBB descendants, notably EG.5, FL.1.5.1, and XBB.1.16, primarily attributed to the F456L spike mutation.
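As a rough illustration of the two quantities named in the abstract above, the sketch below computes a fold change in neutralizing activity and a coefficient of determination (R²). The abstract does not say how fold change was defined; the IC50 ratio, the log10 transform, and all function names here are assumptions made for illustration, not the authors' pipeline.

```python
import math


def fold_change(ic50_variant, ic50_reference):
    """Assumed definition: variant IC50 divided by reference IC50.

    Values above 1 mean more antibody is needed against the variant,
    i.e. reduced neutralizing activity.
    """
    return ic50_variant / ic50_reference


def log10_fold_change(ic50_variant, ic50_reference):
    """Fold change on a log10 scale, a common choice for regression targets."""
    return math.log10(fold_change(ic50_variant, ic50_reference))


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot
```

An R² of 1 means predictions match the measurements exactly, while 0 means the model does no better than predicting the mean of the observed values.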
Affiliation(s)
- Isin Y Comba
  - nference, Cambridge, MA, 02139, USA
  - Division of Public Health, Infectious Diseases and Occupational Medicine, Mayo Clinic Rochester, Rochester, MN, 55905, USA
- Brittany J Poelaert
  - Division of Preclinical Innovation, National Center for Advancing Translational Sciences (NCATS), National Institutes of Health (NIH), Rockville, MD, USA
- Kanny K Wan
  - Division of Preclinical Innovation, National Center for Advancing Translational Sciences (NCATS), National Institutes of Health (NIH), Rockville, MD, USA
- Kyle R Brimacombe
  - Division of Preclinical Innovation, National Center for Advancing Translational Sciences (NCATS), National Institutes of Health (NIH), Rockville, MD, USA
- Venky Soundararajan
  - nference, Cambridge, MA, 02139, USA
  - nference Labs, Bengaluru, Karnataka, 560017, India
2
Mostefai F, Grenier JC, Poujol R, Hussin J. Refining SARS-CoV-2 intra-host variation by leveraging large-scale sequencing data. NAR Genom Bioinform 2024; 6:lqae145. [PMID: 39534500 PMCID: PMC11555433 DOI: 10.1093/nargab/lqae145] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/21/2024] [Revised: 09/13/2024] [Accepted: 10/18/2024] [Indexed: 11/16/2024]
Abstract
Understanding viral genome evolution during host infection is crucial for grasping viral diversity and evolution. Analyzing intra-host single nucleotide variants (iSNVs) offers insights into new lineage emergence, which is important for predicting and mitigating future viral threats. Despite next-generation sequencing's potential, challenges persist, notably sequencing artifacts leading to false iSNVs. We developed a workflow to enhance iSNV detection in large NGS libraries, using over 130 000 SARS-CoV-2 libraries to distinguish mutations from errors. Our approach integrates bioinformatics protocols, stringent quality control, and dimensionality reduction to tackle batch effects and improve mutation detection reliability. Additionally, we pioneer the application of the PHATE visualization approach to genomic data and introduce a methodology that quantifies how related groups of data points are represented within a two-dimensional space, enhancing clustering structure explanation based on genetic similarities. This workflow advances accurate intra-host mutation detection, facilitating a deeper understanding of viral diversity and evolution.
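The abstract above does not give the filtering rules of the workflow, so the following is only a minimal sketch of the general idea of separating candidate iSNVs from likely sequencing errors using depth and allele-frequency thresholds. The thresholds (`min_depth=100`, `min_freq=0.05`), the input layout, and the function name are hypothetical and are not taken from the paper.

```python
def call_isnvs(counts, consensus, min_depth=100, min_freq=0.05):
    """Return candidate iSNVs as (position, base, frequency) tuples.

    counts:    dict mapping position -> dict of base -> read count
    consensus: dict mapping position -> consensus base of the library
    Positions with low coverage are skipped, and minor alleles below the
    frequency threshold are treated as probable sequencing noise.
    Both thresholds are illustrative assumptions.
    """
    calls = []
    for pos in sorted(counts):
        depth = sum(counts[pos].values())
        if depth < min_depth:
            continue  # too few reads to trust any minor allele here
        for base, n in sorted(counts[pos].items()):
            if base == consensus[pos]:
                continue  # only non-consensus alleles are iSNV candidates
            freq = n / depth
            if freq >= min_freq:
                calls.append((pos, base, freq))
    return calls
```

In practice such per-library calls would still need the batch-effect checks the abstract describes, since a fixed threshold alone cannot distinguish recurrent artifacts from real variation.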
Affiliation(s)
- Fatima Mostefai
  - Département de Biochimie et de Médecine Moléculaire, Université de Montréal, Québec, Canada
  - Research Center, Montreal Heart Institute, Québec, Canada
  - Mila - Quebec AI Institute, Université de Montréal, Québec, Canada
- Raphaël Poujol
  - Research Center, Montreal Heart Institute, Québec, Canada
- Julie Hussin
  - Département de Biochimie et de Médecine Moléculaire, Université de Montréal, Québec, Canada
  - Research Center, Montreal Heart Institute, Québec, Canada
  - Mila - Quebec AI Institute, Université de Montréal, Québec, Canada
  - Département de Médecine, Université de Montréal, Québec, Canada
3
Vences M, Patmanidis S, Schmidt JC, Matschiner M, Miralles A, Renner SS. Hapsolutely: a user-friendly tool integrating haplotype phasing, network construction, and haploweb calculation. Bioinformatics Advances 2024; 4:vbae083. [PMID: 38895561 PMCID: PMC11184345 DOI: 10.1093/bioadv/vbae083] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/23/2024] [Revised: 05/15/2024] [Accepted: 06/04/2024] [Indexed: 06/21/2024]
Abstract
Motivation: Haplotype networks are a routine approach to visualize relationships among alleles. Such visual analysis of single-locus data is still of importance, especially in species diagnosis and delimitation, where a limited amount of sequence data is usually available and sufficient, along with other datasets in the framework of integrative taxonomy. In diploid organisms, this often requires separating (phasing) sequences with heterozygotic positions, and typically separate programs are required for phasing, reformatting of input files, and haplotype network construction. We therefore developed Hapsolutely, a user-friendly program with an ergonomic graphical user interface that integrates haplotype phasing from single-locus sequences with five approaches for network/genealogy reconstruction. Results: Among the novel options implemented, Hapsolutely integrates phasing and graphical reconstruction steps of haplotype networks, supports input of species partition data in the common SPART and SPART-XML formats, and calculates and visualizes haplowebs and fields for recombination, thus allowing graphical comparison of allele distribution and allele sharing among subsets for the purpose of species delimitation. The new tool has been specifically developed with a focus on the workflow in alpha-taxonomy, where exploring fields for recombination across alternative species partitions may help species delimitation. Availability and implementation: Hapsolutely is written in Python, and integrates code from Phase, SeqPHASE, and PopART in C++ and Haxe. Compiled stand-alone executables for MS Windows and Mac OS along with a detailed manual can be downloaded from https://www.itaxotools.org; the source code is openly available on GitHub (https://github.com/iTaxoTools/Hapsolutely).
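To make the haploweb idea in the abstract above concrete: a haploweb links haplotypes that co-occur in heterozygous individuals, and a field for recombination is a group of haplotypes connected by such links. The sketch below finds those groups as connected components with a small union-find. It is a generic illustration under that assumed definition, not Hapsolutely's implementation, and the function name and input layout are hypothetical.

```python
def fields_for_recombination(genotypes):
    """Group haplotypes into fields for recombination.

    genotypes: dict mapping individual -> (haplotype_a, haplotype_b),
               i.e. already phased single-locus data.
    Two haplotypes end up in the same field if a chain of heterozygous
    individuals connects them. Returns a list of sets of haplotype names.
    """
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    def union(a, b):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    # Each individual links its two haplotypes (a homozygote links one to itself).
    for hap_a, hap_b in genotypes.values():
        union(hap_a, hap_b)

    groups = {}
    for hap in list(parent):
        groups.setdefault(find(hap), set()).add(hap)
    return sorted(groups.values(), key=sorted)
```

For example, individuals carrying (H1, H2) and (H2, H3) place H1, H2 and H3 in one field, while a homozygote for H4 forms a field of its own.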
Affiliation(s)
- Miguel Vences
  - Division of Evolutionary Biology, Zoological Institute, Technische Universität Braunschweig, 38106 Braunschweig, Germany
- Stefanos Patmanidis
  - Department of Computer Science, School of Electrical and Computer Engineering, National Technical University of Athens, 15780 Athens, Greece
- Jan-Christopher Schmidt
  - Division of Evolutionary Biology, Zoological Institute, Technische Universität Braunschweig, 38106 Braunschweig, Germany
- Aurélien Miralles
  - Division of Evolutionary Biology, Zoological Institute, Technische Universität Braunschweig, 38106 Braunschweig, Germany
  - Institut de Systématique, Évolution, Biodiversité (ISYEB), Muséum National d’Histoire Naturelle, CNRS, Sorbonne Université, EPHE, 75005 Paris, France
- Susanne S Renner
  - Department of Biology, Washington University, Saint Louis, MO 63130, United States
4
Potdar VA, Laxmivandana R, Walimbe AM, Jadhav SK, Pawar P, Kaledhonkar A, Gupta N, Kaur H, Narayan J, Yadav PD, Abraham P, Cherian S. Phylo-geo haplotype network-based characterization of SARS-CoV-2 strains circulating in India (2020-2022). Indian J Med Res 2024; 159:689-694. [PMID: 39382457 PMCID: PMC11463846 DOI: 10.25259/ijmr_252_23] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/09/2023] [Indexed: 10/10/2024]
Abstract
Background & objectives: Genetic analysis of severe acute respiratory syndrome coronavirus-2 (SARS-CoV-2) strains circulating in India during 2020-2022 was carried out to understand the evolution of potentially expanding and divergent clades. Methods: SARS-CoV-2 sequences (n=612) randomly selected from among the sequences of samples collected through a nationwide network of Virus Research Diagnostic Laboratories during 2020 (n=1532) and Indian sequences available in Global Initiative on Sharing All Influenza Data during March 2020-March 2022 (n=53077) were analyzed using the phylo-geo haplotype network approach with reference to the Wuhan prototype sequence. Results: On haplotype analysis, 420 haplotypes were revealed from 643 segregating sites among the sequences. Haplotype sharing was noted among the strains from different geographical regions. Nevertheless, the genetic distance among the viral haplotypes from different clades could differentiate the strains into distinct haplogroups regarding variant emergence. Interpretation & conclusions: The haplotype analysis revealed that the G and GR clades co-evolved and were an epicentre for the evolution of the GH, GK and GRA clades. GH was more frequently identified in northern parts of India than in other parts, whereas GK was detected less in north India than in other parts. Thus, the network analysis facilitated a detailed illustration of the pathways of evolution and circulation of SARS-CoV-2 variants.
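The two summary numbers reported in the abstract above (distinct haplotypes and segregating sites) can be sketched for a toy alignment as follows. This is a simplified illustration, not the authors' workflow: it assumes the sequences are already aligned to equal length and treats every character, including gaps or ambiguity codes, as an ordinary state. The function name is hypothetical.

```python
from collections import Counter


def haplotype_summary(aligned_sequences):
    """Count distinct haplotypes and locate segregating sites.

    aligned_sequences: list of equal-length strings from one alignment.
    Returns (haplotype_counts, segregating_positions), where a segregating
    site is an alignment column showing more than one state.
    """
    if len({len(seq) for seq in aligned_sequences}) > 1:
        raise ValueError("sequences must be aligned to the same length")

    # Identical sequences collapse into one haplotype with a count.
    haplotype_counts = Counter(aligned_sequences)

    # zip(*...) walks the alignment column by column.
    segregating_positions = [
        index
        for index, column in enumerate(zip(*aligned_sequences))
        if len(set(column)) > 1
    ]
    return haplotype_counts, segregating_positions
```

Real analyses usually mask ambiguous bases and gapped columns before counting, which would lower both numbers relative to this naive version.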
Affiliation(s)
- Varsha Atul Potdar
  - National Influenza Centre, ICMR-National Institute of Virology, Pune, India
- Rongala Laxmivandana
  - National Influenza Centre, ICMR-National Institute of Virology, Pune, India
  - Bioinformatics Group, ICMR-National Institute of Virology, Pune, India
- Atul M. Walimbe
  - Bioinformatics Group, ICMR-National Institute of Virology, Pune, India
- Pratiksha Pawar
  - National Influenza Centre, ICMR-National Institute of Virology, Pune, India
- Aditi Kaledhonkar
  - National Influenza Centre, ICMR-National Institute of Virology, Pune, India
- Nivedita Gupta
  - Department of Communicable Diseases, Indian Council of Medical Research, New Delhi, India
- Harmanmeet Kaur
  - Department of Health Research, Ministry of Health & Family Welfare, Government of India, New Delhi, India
- Jitendra Narayan
  - Department of Health Research, Ministry of Health & Family Welfare, Government of India, New Delhi, India
- Pragya D. Yadav
  - Maximum Containment Facility, ICMR-National Institute of Virology, Pune, India
- Priya Abraham
  - National Influenza Centre, ICMR-National Institute of Virology, Pune, India
  - Department of Clinical Virology, Christian Medical College, Vellore, India
- Sarah Cherian
  - Bioinformatics Group, ICMR-National Institute of Virology, Pune, India
- Team VRDL
  - For correspondence: Dr Sarah Cherian, Bioinformatics Group, ICMR-National Institute of Virology, Pune 411 001, Maharashtra, India e-mail:
5
Scicluna M, Grenier JC, Poujol R, Lemieux S, Hussin JG. Toward computing attributions for dimensionality reduction techniques. Bioinformatics Advances 2023; 3:vbad097. [PMID: 37720006 PMCID: PMC10502234 DOI: 10.1093/bioadv/vbad097] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/20/2023] [Revised: 06/21/2023] [Accepted: 08/01/2023] [Indexed: 09/19/2023]
Abstract
Summary: We describe the problem of computing local feature attributions for dimensionality reduction methods. We use one such method that is well established within the context of supervised classification (using the gradients of target outputs with respect to the inputs) on the popular dimensionality reduction technique t-SNE, widely used in analyses of biological data. We provide an efficient implementation for the gradient computation for this dimensionality reduction technique. We show that our explanations identify significant features using a novel validation methodology on synthetic datasets and the popular MNIST benchmark dataset. We then demonstrate the practical utility of our algorithm by showing that it can produce explanations that agree with domain knowledge on a SARS-CoV-2 sequence dataset. Throughout, we provide a road map so that similar explanation methods could be applied to other dimensionality reduction techniques to rigorously analyze biological datasets. Availability and implementation: We have created a Python package that can be installed using the following command: pip install interpretable_tsne. All code used can be found at github.com/MattScicluna/interpretable_tsne.
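The core idea named in the abstract above, attributing an output to input features through the gradient of that output with respect to the inputs, can be shown on a toy function. The sketch below uses a central finite difference on a made-up one-dimensional "embedding coordinate"; it does not use or reproduce the interpretable_tsne package, and the paper's gradient computation for t-SNE itself is not shown here. All names are hypothetical.

```python
import math


def numerical_gradient(f, x, eps=1e-6):
    """Central finite-difference gradient of scalar function f at point x."""
    grad = []
    for i in range(len(x)):
        up, down = list(x), list(x)
        up[i] += eps
        down[i] -= eps
        grad.append((f(up) - f(down)) / (2 * eps))
    return grad


def toy_embedding_coordinate(x):
    """Stand-in for one coordinate of a low-dimensional embedding.

    Assumed form: tanh of a fixed weighted sum; the third feature has
    zero weight, so it should receive (near) zero attribution.
    """
    weights = [2.0, -3.0, 0.0]
    return math.tanh(sum(w * v for w, v in zip(weights, x)))


def gradient_attributions(f, x):
    """Attribution of each input feature = partial derivative of f at x."""
    return numerical_gradient(f, x)
```

Features with a large absolute gradient are the ones whose small changes would move the point most in the embedding, which is the sense in which they are "important" locally.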
Affiliation(s)
- Matthew Scicluna
  - Montreal Heart Institute, Research Center, Montreal, Quebec H1T 1C8, Canada
  - Département de Biochimie et Medecine Moleculaire, Université de Montréal, Montreal, Quebec H3C 3J7, Canada
- Raphaël Poujol
  - Montreal Heart Institute, Research Center, Montreal, Quebec H1T 1C8, Canada
- Sébastien Lemieux
  - Département de Biochimie et Medecine Moleculaire, Université de Montréal, Montreal, Quebec H3C 3J7, Canada
  - Mila-Quebec AI institute, Montreal, Quebec H2S 3H1, Canada
- Julie G Hussin
  - Montreal Heart Institute, Research Center, Montreal, Quebec H1T 1C8, Canada
  - Département de Biochimie et Medecine Moleculaire, Université de Montréal, Montreal, Quebec H3C 3J7, Canada
  - Mila-Quebec AI institute, Montreal, Quebec H2S 3H1, Canada
  - Département de Medecine, Université de Montréal, Montreal, Quebec H3C 3A7, Canada
6
Gazeau S, Deng X, Ooi HK, Mostefai F, Hussin J, Heffernan J, Jenner AL, Craig M. The race to understand immunopathology in COVID-19: Perspectives on the impact of quantitative approaches to understand within-host interactions. Immunoinformatics (Amsterdam, Netherlands) 2023; 9:100021. [PMID: 36643886 PMCID: PMC9826539 DOI: 10.1016/j.immuno.2023.100021] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 06/17/2022] [Revised: 11/16/2022] [Accepted: 01/03/2023] [Indexed: 01/09/2023]
Abstract
The COVID-19 pandemic has revealed the need for the increased integration of modelling and data analysis into public health, experimental, and clinical studies. Throughout the first two years of the pandemic, there has been a concerted effort to improve our understanding of the within-host immune response to the SARS-CoV-2 virus to provide better predictions of COVID-19 severity, treatment and vaccine development questions, and insights into viral evolution and the impacts of variants on immunopathology. Here we provide perspectives on what has been accomplished using quantitative methods, including predictive modelling, population genetics, machine learning, and dimensionality reduction techniques, in the first 26 months of the COVID-19 pandemic, and where we go from here to improve our responses to this and future pandemics.
Affiliation(s)
- Sonia Gazeau
  - Department of Mathematics and Statistics, Université de Montréal, Montréal, Canada
  - Sainte-Justine University Hospital Research Centre, Montréal, Canada
- Xiaoyan Deng
  - Department of Mathematics and Statistics, Université de Montréal, Montréal, Canada
  - Sainte-Justine University Hospital Research Centre, Montréal, Canada
- Hsu Kiang Ooi
  - Digital Technologies Research Centre, National Research Council Canada, Toronto, Canada
- Fatima Mostefai
  - Montréal Heart Institute Research Centre, Montréal, Canada
  - Department of Medicine, Faculty of Medicine, Université de Montréal, Montréal, Canada
- Julie Hussin
  - Montréal Heart Institute Research Centre, Montréal, Canada
  - Department of Medicine, Faculty of Medicine, Université de Montréal, Montréal, Canada
- Jane Heffernan
  - Modelling Infection and Immunity Lab, Mathematics & Statistics, York University, Toronto, Canada
  - Centre for Disease Modelling (CDM), Mathematics & Statistics, York University, Toronto, Canada
- Adrianne L Jenner
  - School of Mathematical Sciences, Queensland University of Technology, Brisbane, Australia
- Morgan Craig
  - Department of Mathematics and Statistics, Université de Montréal, Montréal, Canada
  - Sainte-Justine University Hospital Research Centre, Montréal, Canada