1
Liu A, Peng B, Pankajam AV, Duong TE, Pryhuber G, Scheuermann RH, Zhang Y. Discovery of optimal cell type classification marker genes from single cell RNA sequencing data. bioRxiv 2024:2024.04.22.590194. [PMID: 38712147 PMCID: PMC11071431 DOI: 10.1101/2024.04.22.590194] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/08/2024]
Abstract
The use of single cell/nucleus RNA sequencing (scRNA-seq) technologies that quantitatively describe cell transcriptional phenotypes is revolutionizing our understanding of cell biology, leading to new insights in cell type identification, disease mechanisms, and drug development. The tremendous growth in scRNA-seq data has posed new challenges in efficiently characterizing data-driven cell types and identifying quantifiable marker genes for cell type classification. The use of machine learning and explainable artificial intelligence has emerged as an effective approach to study large-scale scRNA-seq data. NS-Forest is a random forest machine learning-based algorithm that aims to provide a scalable data-driven solution to identify minimum combinations of necessary and sufficient marker genes that capture cell type identity with maximum classification accuracy. Here, we describe the latest version, NS-Forest version 4.0, and its companion Python package (https://github.com/JCVenterInstitute/NSForest), with several enhancements to select marker gene combinations that exhibit highly selective expression patterns among closely related cell types and more efficiently perform marker gene selection for large-scale scRNA-seq data atlases with millions of cells. By modularizing the final decision tree step, NS-Forest v4.0 can be used to compare the performance of user-defined marker genes with the NS-Forest computationally-derived marker genes based on the decision tree classifiers. To quantify how well the identified markers exhibit the desired pattern of being exclusively expressed at high levels within their target cell types, we introduce the On-Target Fraction metric that ranges from 0 to 1, with a value of 1 assigned to markers that are only expressed within their target cell types and not in cells of any other cell types. NS-Forest v4.0 outperforms previous versions in its ability to identify markers with higher On-Target Fraction values for closely related cell types and outperforms other marker gene selection approaches at classification with significantly higher F-beta scores when applied to datasets from three human organs: brain, kidney, and lung.
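As a rough illustration of the two metrics named in the abstract above, the following Python sketch computes an On-Target Fraction and an F-beta score. The abstract does not state the formula, so the definition used here (median expression in the target cell type divided by the sum of median expression across all cell types) is an assumption, as are the function names and the toy inputs. The default beta = 0.5 is an illustrative setting that weights precision over recall. This is not the NS-Forest package API.

```python
from statistics import median


def on_target_fraction(expr_by_type, target):
    """Assumed definition: median expression of one gene in the target
    cell type divided by the sum of its medians over all cell types.

    expr_by_type: dict mapping cell type name -> list of expression
    values for a single gene (hypothetical input layout).
    """
    medians = {ct: median(vals) for ct, vals in expr_by_type.items()}
    total = sum(medians.values())
    if total == 0:
        return 0.0  # gene not expressed anywhere
    return medians[target] / total


def f_beta(tp, fp, fn, beta=0.5):
    """Standard F-beta score from confusion counts.

    beta < 1 weights precision more heavily than recall.
    """
    b2 = beta * beta
    denom = (1 + b2) * tp + b2 * fn + fp
    return (1 + b2) * tp / denom if denom else 0.0


# Toy example: a gene expressed only in cell type "A".
exclusive = {"A": [2.0, 4.0, 6.0], "B": [0.0, 0.0, 0.0]}
# Toy example: a gene also expressed, at a lower level, in "B".
leaky = {"A": [3.0, 3.0], "B": [1.0, 1.0]}

print(on_target_fraction(exclusive, "A"))
print(on_target_fraction(leaky, "A"))
print(f_beta(tp=8, fp=2, fn=2))
```

Under this assumed definition, a marker expressed only in its target type scores 1, and any expression in other types pulls the value toward 0, which matches the range described in the abstract.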
2
Smith C, Telesford KM, Piccirillo SGM, Licon-Munoz Y, Zhang W, Tse KM, Rivas JR, Joshi C, Shah DS, Wu AX, Trivedi R, Christley S, Qian Y, Cowell LG, Scheuermann RH, Stowe AM, Nguyen L, Greenberg BM, Monson NL. Astrocytic stress response is induced by exposure to astrocyte-binding antibodies expressed by plasmablasts from pediatric patients with acute transverse myelitis. J Neuroinflammation 2024; 21:161. [PMID: 38915059 PMCID: PMC11197286 DOI: 10.1186/s12974-024-03127-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/11/2024] [Accepted: 05/08/2024] [Indexed: 06/26/2024]
Abstract
BACKGROUND Pediatric acute transverse myelitis (ATM) accounts for 20-30% of children presenting with a first acquired demyelinating syndrome (ADS) and may be the first clinical presentation of a relapsing ADS such as multiple sclerosis (MS). B cells have been strongly implicated in the pathogenesis of adult MS. However, little is known about B cells in pediatric MS, and even less so in pediatric ATM. Our lab previously showed that plasmablasts (PB), the earliest B cell subtype producing antibody, are expanded in adult ATM, and that these PBs produce self-reactive antibodies that target neurons. The goal of this study was to examine PB frequency and phenotype, immunoglobulin selection, and B cell receptor reactivity in pediatric patients presenting with ATM to gain insight into B cell involvement in disease. METHODS We compared the PB frequency and phenotype of 5 pediatric ATM patients and 10 pediatric healthy controls (HC) and compared them to previously reported adult ATM patients using cytometric data. We purified bulk IgG from the plasma samples and cloned 20 recombinant human antibodies (rhAbs) from individual PBs isolated from the blood. Plasma-derived IgG and rhAb autoreactivity was measured by mean fluorescence intensity (MFI) in neurons and astrocytes of murine brain or spinal cord and primary human astrocytes. We determined the potential impact of these rhAbs on astrocyte health by measuring stress and apoptotic response. RESULTS We found that pediatric ATM patients had a reduced frequency of peripheral blood PB. Serum IgG autoreactivity to neurons in EAE spinal cord was similar in the pediatric ATM patients and HC. However, serum IgG autoreactivity to astrocytes in EAE spinal cord was reduced in pediatric ATM patients compared to pediatric HC. Astrocyte-binding strength of rhAbs cloned from PBs was dependent on somatic hypermutation accumulation in the pediatric ATM cohort, but not HC. A similar observation in predilection for astrocyte binding over neuron binding of individual antibodies cloned from PBs was made in EAE brain tissue. Finally, exposure of human primary astrocytes to these astrocyte-binding antibodies increased astrocytic stress but did not lead to apoptosis. CONCLUSIONS Discordance in humoral immune responses to astrocytes may distinguish pediatric ATM from HC.
3
Aevermann BD, Di Domizio J, Olah P, Saidoune F, Armstrong JM, Bachelez H, Barker J, Haniffa M, Julia V, Juul K, Krishnaswamy JK, Litman T, Parsons I, Sarin KY, Schmuth M, Sierra M, Simpson M, Homey B, Griffiths CEM, Scheuermann RH, Gilliet M. Cross-Comparison of Inflammatory Skin Disease Transcriptomics Identifies PTEN as a Pathogenic Disease Classifier in Cutaneous Lupus Erythematosus. J Invest Dermatol 2024; 144:252-262.e4. [PMID: 37598867 DOI: 10.1016/j.jid.2023.06.211] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/16/2022] [Revised: 06/05/2023] [Accepted: 06/09/2023] [Indexed: 08/22/2023]
Abstract
Tissue transcriptomics is used to uncover molecular dysregulations underlying diseases. However, the majority of transcriptomics studies focus on single diseases with limited relevance for understanding the molecular relationship between diseases or for identifying disease-specific markers. In this study, we used a normalization approach to compare gene expression across nine inflammatory skin diseases. The normalized datasets were found to retain differential expression signals that allowed unsupervised disease clustering and identification of disease-specific gene signatures. Using the NS-Forest algorithm, we identified a minimal set of biomarkers and validated their use as a diagnostic disease classifier. Among them, PTEN was identified as being a specific marker for cutaneous lupus erythematosus and found to be strongly expressed by lesional keratinocytes in association with pathogenic type I IFNs. In fact, PTEN facilitated the expression of IFN-β and IFN-κ in keratinocytes by promoting activation and nuclear translocation of IRF3. Thus, cross-comparison of tissue transcriptomics is a valid strategy to establish a molecular disease classification and to identify pathogenic disease biomarkers.
4
Beverley J, Babcock S, Carvalho G, Cowell LG, Duesing S, He Y, Hurley R, Merrell E, Scheuermann RH, Smith B. Coordinating virus research: The Virus Infectious Disease Ontology. PLoS One 2024; 19:e0285093. [PMID: 38236918 PMCID: PMC10796065 DOI: 10.1371/journal.pone.0285093] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/28/2022] [Accepted: 04/12/2023] [Indexed: 01/22/2024]
Abstract
The COVID-19 pandemic prompted immense work on the investigation of the SARS-CoV-2 virus. Rapid, accurate, and consistent interpretation of generated data is thereby of fundamental concern. Ontologies (structured, controlled vocabularies) are designed to support consistency of interpretation, and thereby to prevent the development of data silos. This paper describes how ontologies are serving this purpose in the COVID-19 research domain, by following principles of the Open Biological and Biomedical Ontology (OBO) Foundry and by reusing existing ontologies such as the Infectious Disease Ontology (IDO) Core, which provides terminological content common to investigations of all infectious diseases. We report here on the development of an IDO extension, the Virus Infectious Disease Ontology (VIDO), a reference ontology covering viral infectious diseases. We motivate term and definition choices, showcase reuse of terms from existing OBO ontologies, illustrate how ontological decisions were motivated by relevant life science research, and connect VIDO to the Coronavirus Infectious Disease Ontology (CIDO). We next use terms from these ontologies to annotate selections from life science research on SARS-CoV-2, highlighting how ontologies employing a common upper-level vocabulary may be seamlessly interwoven. Finally, we outline future work, including bacteria and fungus infectious disease reference ontologies currently under development, then cite uses of VIDO and CIDO in host-pathogen data analytics, electronic health record annotation, and ontology conflict-resolution projects.
5
Chapman OS, Luebeck J, Sridhar S, Wong ITL, Dixit D, Wang S, Prasad G, Rajkumar U, Pagadala MS, Larson JD, He BJ, Hung KL, Lange JT, Dehkordi SR, Chandran S, Adam M, Morgan L, Wani S, Tiwari A, Guccione C, Lin Y, Dutta A, Lo YY, Juarez E, Robinson JT, Korshunov A, Michaels JEA, Cho YJ, Malicki DM, Coufal NG, Levy ML, Hobbs C, Scheuermann RH, Crawford JR, Pomeroy SL, Rich JN, Zhang X, Chang HY, Dixon JR, Bagchi A, Deshpande AJ, Carter H, Fraenkel E, Mischel PS, Wechsler-Reya RJ, Bafna V, Mesirov JP, Chavez L. Circular extrachromosomal DNA promotes tumor heterogeneity in high-risk medulloblastoma. Nat Genet 2023; 55:2189-2199. [PMID: 37945900 PMCID: PMC10703696 DOI: 10.1038/s41588-023-01551-3] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Received: 06/01/2022] [Accepted: 09/22/2023] [Indexed: 11/12/2023]
Abstract
Circular extrachromosomal DNA (ecDNA) in patient tumors is an important driver of oncogenic gene expression, evolution of drug resistance and poor patient outcomes. Applying computational methods for the detection and reconstruction of ecDNA across a retrospective cohort of 481 medulloblastoma tumors from 465 patients, we identify circular ecDNA in 82 patients (18%). Patients with ecDNA-positive medulloblastoma were more than twice as likely to relapse and three times as likely to die within 5 years of diagnosis. A subset of tumors harbored multiple ecDNA lineages, each containing distinct amplified oncogenes. Multimodal sequencing, imaging and CRISPR inhibition experiments in medulloblastoma models reveal intratumoral heterogeneity of ecDNA copy number per cell and frequent putative 'enhancer rewiring' events on ecDNA. This study reveals the frequency and diversity of ecDNA in medulloblastoma, stratified into molecular subgroups, and suggests copy number heterogeneity and enhancer rewiring as oncogenic features of ecDNA.
6
Boussaty EC, Tedeschi N, Novotny M, Ninoyu Y, Du E, Draf C, Zhang Y, Manor U, Scheuermann RH, Friedman R. Cochlear transcriptome analysis of an outbred mouse population (CFW). Front Cell Neurosci 2023; 17:1256619. [PMID: 38094513 PMCID: PMC10716316 DOI: 10.3389/fncel.2023.1256619] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/11/2023] [Accepted: 10/11/2023] [Indexed: 12/20/2023]
Abstract
Age-related hearing loss (ARHL) is the most common cause of hearing loss and one of the most prevalent conditions affecting the elderly worldwide. Despite evidence from our lab and others about its polygenic nature, little is known about the specific genes, cell types, and pathways involved in ARHL, impeding the development of therapeutic interventions. In this manuscript, we describe, for the first time, the complete cell-type specific transcriptome of the aging mouse cochlea using snRNA-seq in an outbred mouse model in relation to auditory threshold variation. Cochlear cell types were identified using unsupervised clustering and annotated via a three-tiered approach: first by linking to expression of known marker genes, then using the NSForest algorithm to select minimum cluster-specific marker genes and reduce dimensional feature space for statistical comparison of our clusters with existing publicly-available data sets on the gEAR website, and finally, by validating and refining the annotations using Multiplexed Error Robust Fluorescence In Situ Hybridization (MERFISH) and the cluster-specific marker genes as probes. We report on 60 unique cell types, expanding the number of defined cochlear cell types by more than two times. Importantly, we show significant specific cell type increases and decreases associated with loss of hearing acuity implicating specific subsets of hair cell subtypes, ganglion cell subtypes, and cell subtypes within the stria vascularis in this model of ARHL. These results provide a view into the cellular and molecular mechanisms responsible for age-related hearing loss and pathways for therapeutic targeting.
7
Zimmerman O, Zimmerman MI, Raju S, Nelson CA, Errico JM, Madden EA, Holmes AC, Hassan AO, VanBlargan LA, Kim AS, Adams LJ, Basore K, Whitener BM, Palakurty S, Davis-Adams HG, Sun C, Gilliland T, Earnest JT, Ma H, Ebel GD, Zmasek C, Scheuermann RH, Klimstra WB, Fremont DH, Diamond MS. Vertebrate-class-specific binding modes of the alphavirus receptor MXRA8. Cell 2023; 186:4818-4833.e25. [PMID: 37804831 PMCID: PMC10615782 DOI: 10.1016/j.cell.2023.09.007] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 11/04/2022] [Revised: 05/09/2023] [Accepted: 09/08/2023] [Indexed: 10/09/2023]
Abstract
MXRA8 is a receptor for chikungunya (CHIKV) and other arthritogenic alphaviruses with mammalian hosts. However, mammalian MXRA8 does not bind to alphaviruses that infect humans and have avian reservoirs. Here, we show that avian, but not mammalian, MXRA8 can act as a receptor for Sindbis, western equine encephalitis (WEEV), and related alphaviruses with avian reservoirs. Structural analysis of duck MXRA8 complexed with WEEV reveals an inverted binding mode compared with mammalian MXRA8 bound to CHIKV. Whereas both domains of mammalian MXRA8 bind CHIKV E1 and E2, only domain 1 of avian MXRA8 engages WEEV E1, and no appreciable contacts are made with WEEV E2. Using these results, we generated a chimeric avian-mammalian MXRA8 decoy-receptor that neutralizes infection of multiple alphaviruses from distinct antigenic groups in vitro and in vivo. Thus, different alphaviruses can bind MXRA8 encoded by different vertebrate classes with distinct engagement modes, which enables development of broad-spectrum inhibitors.
8
Robles EE, Jin Y, Smyth P, Scheuermann RH, Bui JD, Wang HY, Oak J, Qian Y. A cell-level discriminative neural network model for diagnosis of blood cancers. Bioinformatics 2023; 39:btad585. [PMID: 37756695 PMCID: PMC10563151 DOI: 10.1093/bioinformatics/btad585] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/06/2023] [Revised: 09/12/2023] [Accepted: 09/22/2023] [Indexed: 09/29/2023]
Abstract
MOTIVATION Precise identification of cancer cells in patient samples is essential for accurate diagnosis and clinical monitoring but has been a significant challenge in machine learning approaches for cancer precision medicine. In most scenarios, training data are only available with disease annotation at the subject or sample level. Traditional approaches separate the classification process into multiple steps that are optimized independently. Recent methods either focus on predicting sample-level diagnosis without identifying individual pathologic cells or are less effective for identifying heterogeneous cancer cell phenotypes. RESULTS We developed a generalized end-to-end differentiable model, the Cell Scoring Neural Network (CSNN), which takes sample-level training data and predicts the diagnosis of the testing samples and the identity of the diagnostic cells in the sample, simultaneously. The cell-level density differences between samples are linked to the sample diagnosis, which allows the probabilities of individual cells being diagnostic to be calculated using backpropagation. We applied CSNN to two independent clinical flow cytometry datasets for leukemia diagnosis. In both qualitative and quantitative assessments, CSNN outperformed preexisting neural network modeling approaches for both cancer diagnosis and cell-level classification. Post hoc decision trees and 2D dot plots were generated for interpretation of the identified cancer cells, showing that the identified cell phenotypes match the cancer endotypes observed clinically in patient cohorts. Independent data clustering analysis confirmed the identified cancer cell populations. AVAILABILITY AND IMPLEMENTATION The source code of CSNN and datasets used in the experiments are publicly available on GitHub (http://github.com/erobl/csnn). Raw FCS files can be downloaded from FlowRepository (ID: FR-FCM-Z6YK).
9
Proal AD, VanElzakker MB, Aleman S, Bach K, Boribong BP, Buggert M, Cherry S, Chertow DS, Davies HE, Dupont CL, Deeks SG, Eimer W, Ely EW, Fasano A, Freire M, Geng LN, Griffin DE, Henrich TJ, Iwasaki A, Izquierdo-Garcia D, Locci M, Mehandru S, Painter MM, Peluso MJ, Pretorius E, Price DA, Putrino D, Scheuermann RH, Tan GS, Tanzi RE, VanBrocklin HF, Yonker LM, Wherry EJ. Author Correction: SARS-CoV-2 reservoir in post-acute sequelae of COVID-19 (PASC). Nat Immunol 2023; 24:1778. [PMID: 37723351 DOI: 10.1038/s41590-023-01646-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 09/20/2023]
10
Proal AD, VanElzakker MB, Aleman S, Bach K, Boribong BP, Buggert M, Cherry S, Chertow DS, Davies HE, Dupont CL, Deeks SG, Eimer W, Ely EW, Fasano A, Freire M, Geng LN, Griffin DE, Henrich TJ, Iwasaki A, Izquierdo-Garcia D, Locci M, Mehandru S, Painter MM, Peluso MJ, Pretorius E, Price DA, Putrino D, Scheuermann RH, Tan GS, Tanzi RE, VanBrocklin HF, Yonker LM, Wherry EJ. SARS-CoV-2 reservoir in post-acute sequelae of COVID-19 (PASC). Nat Immunol 2023; 24:1616-1627. [PMID: 37667052 DOI: 10.1038/s41590-023-01601-2] [Citation(s) in RCA: 51] [Impact Index Per Article: 51.0] [Received: 03/30/2023] [Accepted: 07/18/2023] [Indexed: 09/06/2023]
Abstract
Millions of people are suffering from Long COVID or post-acute sequelae of COVID-19 (PASC). Several biological factors have emerged as potential drivers of PASC pathology. Some individuals with PASC may not fully clear the coronavirus SARS-CoV-2 after acute infection. Instead, replicating virus and/or viral RNA (potentially capable of being translated to produce viral proteins) persist in tissue as a 'reservoir'. This reservoir could modulate host immune responses or release viral proteins into the circulation. Here we review studies that have identified SARS-CoV-2 RNA/protein or immune responses indicative of a SARS-CoV-2 reservoir in PASC samples. Mechanisms by which a SARS-CoV-2 reservoir may contribute to PASC pathology, including coagulation, microbiome and neuroimmune abnormalities, are delineated. We identify research priorities to guide the further study of a SARS-CoV-2 reservoir in PASC, with the goal that clinical trials of antivirals or other therapeutics with potential to clear a SARS-CoV-2 reservoir are accelerated.
11
Tarke A, Zhang Y, Methot N, Narowski TM, Phillips E, Mallal S, Frazier A, Filaci G, Weiskopf D, Dan JM, Premkumar L, Scheuermann RH, Sette A, Grifoni A. Targets and cross-reactivity of human T cell recognition of common cold coronaviruses. Cell Rep Med 2023; 4:101088. [PMID: 37295422 PMCID: PMC10242702 DOI: 10.1016/j.xcrm.2023.101088] [Citation(s) in RCA: 7] [Impact Index Per Article: 7.0] [Received: 12/16/2022] [Revised: 03/17/2023] [Accepted: 05/24/2023] [Indexed: 06/12/2023]
Abstract
The coronavirus (CoV) family includes several viruses infecting humans, highlighting the importance of exploring pan-CoV vaccine strategies to provide broad adaptive immune protection. We analyze T cell reactivity against representative Alpha (NL63) and Beta (OC43) common cold CoVs (CCCs) in pre-pandemic samples. S, N, M, and nsp3 antigens are immunodominant, as shown for severe acute respiratory syndrome 2 (SARS2), while nsp2 and nsp12 are Alpha or Beta specific. We further identify 78 OC43- and 87 NL63-specific epitopes, and, for a subset of those, we assess the T cell capability to cross-recognize sequences from representative viruses belonging to AlphaCoV, sarbecoCoV, and Beta-non-sarbecoCoV groups. We find T cell cross-reactivity within the Alpha and Beta groups, in 89% of the instances associated with sequence conservation >67%. However, despite conservation, limited cross-reactivity is observed for sarbecoCoV, indicating that previous CoV exposure is a contributing factor in determining cross-reactivity. Overall, these results provide critical insights in developing future pan-CoV vaccines.
12
Zhang Y, Miller JA, Park J, Lelieveldt BP, Long B, Abdelaal T, Aevermann BD, Biancalani T, Comiter C, Dzyubachyk O, Eggermont J, Langseth CM, Petukhov V, Scalia G, Vaishnav ED, Zhao Y, Lein ES, Scheuermann RH. Reference-based cell type matching of in situ image-based spatial transcriptomics data on primary visual cortex of mouse brain. Sci Rep 2023; 13:9567. [PMID: 37311768 DOI: 10.1038/s41598-023-36638-8] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 12/18/2022] [Accepted: 06/07/2023] [Indexed: 06/15/2023]
Abstract
With the advent of multiplex fluorescence in situ hybridization (FISH) and in situ RNA sequencing technologies, spatial transcriptomics analysis is advancing rapidly, providing spatial location and gene expression information about cells in tissue sections at single cell resolution. Cell type classification of these spatially-resolved cells can be inferred by matching the spatial transcriptomics data to reference atlases derived from single cell RNA-sequencing (scRNA-seq) in which cell types are defined by differences in their gene expression profiles. However, robust cell type matching of the spatially-resolved cells to reference scRNA-seq atlases is challenging due to the intrinsic differences in resolution between the spatial and scRNA-seq data. In this study, we systematically evaluated six computational algorithms for cell type matching across four image-based spatial transcriptomics experimental protocols (MERFISH, smFISH, BaristaSeq, and ExSeq) conducted on the same mouse primary visual cortex (VISp) brain region. We find that many cells are assigned as the same type by multiple cell type matching algorithms and are present in spatial patterns previously reported from scRNA-seq studies in VISp. Furthermore, by combining the results of individual matching strategies into consensus cell type assignments, we see even greater alignment with biological expectations. We present two ensemble meta-analysis strategies used in this study and share the consensus cell type matching results in the Cytosplore Viewer ( https://viewer.cytosplore.org ) for interactive visualization and data exploration. The consensus matching can also guide spatial data analysis using SSAM, allowing segmentation-free cell type assignment.
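The idea of combining several matching algorithms into a consensus assignment can be sketched in Python as follows. The abstract does not describe the two ensemble strategies used in the study, so this simple per-cell majority vote is a stand-in rather than the authors' method. The function name, the `min_agree` threshold, the "unassigned" label, and the toy algorithm and cell type names are all hypothetical.

```python
from collections import Counter


def consensus_labels(assignments, min_agree=2):
    """Per-cell majority vote across matching algorithms.

    assignments: dict mapping algorithm name -> list of cell type
    labels, one per cell, all lists in the same cell order.
    A cell is labelled "unassigned" when the top label has fewer than
    min_agree votes or is tied with another label.
    """
    per_method = list(assignments.values())
    n_cells = len(per_method[0])
    if any(len(labels) != n_cells for labels in per_method):
        raise ValueError("all algorithms must label the same cells")

    consensus = []
    for i in range(n_cells):
        votes = Counter(labels[i] for labels in per_method)
        top = votes.most_common(2)
        label, count = top[0]
        tied = len(top) > 1 and top[1][1] == count
        ok = count >= min_agree and not tied
        consensus.append(label if ok else "unassigned")
    return consensus


# Toy input: three hypothetical algorithms labelling three cells.
toy = {
    "algo_a": ["L2/3", "Pvalb", "Sst"],
    "algo_b": ["L2/3", "Sst", "Vip"],
    "algo_c": ["L2/3", "Pvalb", "Lamp5"],
}
print(consensus_labels(toy))
```

Leaving disputed cells unassigned, rather than forcing a label, keeps the consensus conservative; a weighted vote would be a natural alternative if per-algorithm confidence scores were available.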
13
Boussaty EC, Tedeschi N, Novotny M, Ninoyu Y, Du E, Draf C, Zhang Y, Manor U, Scheuermann RH, Friedman R. Cochlear transcriptome analysis of an outbred mouse population (CFW). bioRxiv 2023:2023.02.15.528661. [PMID: 36824745 PMCID: PMC9948975 DOI: 10.1101/2023.02.15.528661] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/18/2023]
Abstract
Age-related hearing loss (ARHL) is the most common cause of hearing loss and one of the most prevalent conditions affecting the elderly worldwide. Despite evidence from our lab and others about its polygenic nature, little is known about the specific genes, cell types and pathways involved in ARHL, impeding the development of therapeutic interventions. In this manuscript, we describe, for the first time, the complete cell-type specific transcriptome of the aging mouse cochlea using snRNA-seq in an outbred mouse model in relation to auditory threshold variation. Cochlear cell types were identified using unsupervised clustering and annotated via a three-tiered approach: first by linking to expression of known marker genes, then using the NS-Forest algorithm to select minimum cluster-specific marker genes and reduce dimensional feature space for statistical comparison of our clusters with existing publicly-available data sets on the gEAR website (https://umgear.org/), and finally, by validating and refining the annotations using Multiplexed Error Robust Fluorescence In Situ Hybridization (MERFISH) and the cluster-specific marker genes as probes. We report on 60 unique cell types, expanding the number of defined cochlear cell types by more than two times. Importantly, we show significant specific cell type increases and decreases associated with loss of hearing acuity implicating specific subsets of hair cell subtypes, ganglion cell subtypes, and cell subtypes within the stria vascularis in this model of ARHL. These results provide a view into the cellular and molecular mechanisms responsible for age-related hearing loss and pathways for therapeutic targeting.
14
Hawrylycz M, Martone ME, Ascoli GA, Bjaalie JG, Dong HW, Ghosh SS, Gillis J, Hertzano R, Haynor DR, Hof PR, Kim Y, Lein E, Liu Y, Miller JA, Mitra PP, Mukamel E, Ng L, Osumi-Sutherland D, Peng H, Ray PL, Sanchez R, Regev A, Ropelewski A, Scheuermann RH, Tan SZK, Thompson CL, Tickle T, Tilgner H, Varghese M, Wester B, White O, Zeng H, Aevermann B, Allemang D, Ament S, Athey TL, Baker C, Baker KS, Baker PM, Bandrowski A, Banerjee S, Bishwakarma P, Carr A, Chen M, Choudhury R, Cool J, Creasy H, D’Orazi F, Degatano K, Dichter B, Ding SL, Dolbeare T, Ecker JR, Fang R, Fillion-Robin JC, Fliss TP, Gee J, Gillespie T, Gouwens N, Zhang GQ, Halchenko YO, Harris NL, Herb BR, Hintiryan H, Hood G, Horvath S, Huo B, Jarecka D, Jiang S, Khajouei F, Kiernan EA, Kir H, Kruse L, Lee C, Lelieveldt B, Li Y, Liu H, Liu L, Markuhar A, Mathews J, Mathews KL, Mezias C, Miller MI, Mollenkopf T, Mufti S, Mungall CJ, Orvis J, Puchades MA, Qu L, Receveur JP, Ren B, Sjoquist N, Staats B, Tward D, van Velthoven CTJ, Wang Q, Xie F, Xu H, Yao Z, Yun Z, Zhang YR, Zheng WJ, Zingg B. A guide to the BRAIN Initiative Cell Census Network data ecosystem. PLoS Biol 2023; 21:e3002133. [PMID: 37390046 PMCID: PMC10313015 DOI: 10.1371/journal.pbio.3002133] [Citation(s) in RCA: 9] [Impact Index Per Article: 9.0] [Indexed: 07/02/2023]
Abstract
Characterizing cellular diversity at different levels of biological organization and across data modalities is a prerequisite to understanding the function of cell types in the brain. Classification of neurons is also essential to manipulate cell types in controlled ways and to understand their variation and vulnerability in brain disorders. The BRAIN Initiative Cell Census Network (BICCN) is an integrated network of data-generating centers, data archives, and data standards developers, with the goal of systematic multimodal brain cell type profiling and characterization. Emphasis of the BICCN is on the whole mouse brain with demonstration of prototype feasibility for human and nonhuman primate (NHP) brains. Here, we provide a guide to the cellular and spatial approaches employed by the BICCN, and to accessing and using these data and extensive resources, including the BRAIN Cell Data Center (BCDC), which serves to manage and integrate data across the ecosystem. We illustrate the power of the BICCN data ecosystem through vignettes highlighting several BICCN analysis and visualization tools. Finally, we present emerging standards that have been developed or adopted toward Findable, Accessible, Interoperable, and Reusable (FAIR) neuroscience. The combined BICCN ecosystem provides a comprehensive resource for the exploration and analysis of cell types in the brain.
15
Tan SZK, Kir H, Aevermann BD, Gillespie T, Harris N, Hawrylycz MJ, Jorstad NL, Lein ES, Matentzoglu N, Miller JA, Mollenkopf TS, Mungall CJ, Ray PL, Sanchez REA, Staats B, Vermillion J, Yadav A, Zhang Y, Scheuermann RH, Osumi-Sutherland D. Author Correction: Brain Data Standards - A method for building data-driven cell-type ontologies. Sci Data 2023; 10:246. [PMID: 37117232 PMCID: PMC10147694 DOI: 10.1038/s41597-023-02165-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/30/2023]
16
Roper RL, Garzino-Demo A, Del Rio C, Bréchot C, Gallo R, Hall W, Esparza J, Reitz M, Schinazi RF, Parrington M, Tartaglia J, Koopmans M, Osorio J, Nitsche A, Huan TB, LeDuc J, Gessain A, Weaver S, Mahalingam S, Abimiku A, Vahlne A, Segales J, Wang L, Isaacs SN, Osterhaus A, Scheuermann RH, McFadden G. Monkeypox (Mpox) requires continued surveillance, vaccines, therapeutics and mitigating strategies. Vaccine 2023; 41:3171-3177. [PMID: 37088603 PMCID: PMC10120921 DOI: 10.1016/j.vaccine.2023.04.010] [Citation(s) in RCA: 9] [Impact Index Per Article: 9.0] [Received: 10/07/2022] [Accepted: 04/03/2023] [Indexed: 04/25/2023]
Abstract
The widespread outbreak of the monkeypox virus (MPXV) recognized in 2022 poses new challenges for public healthcare systems worldwide. With more than 86,000 people infected, there is concern that MPXV may become endemic outside of its original geographical area leading to repeated human spillover infections or continue to be spread person-to-person. Fortunately, classical public health measures (e.g., isolation, contact tracing and quarantine) and vaccination have blunted the spread of the virus, but cases are continuing to be reported in 28 countries in March 2023. We describe here the vaccines and drugs available for the prevention and treatment of MPXV infections. However, although their efficacy against monkeypox (mpox) has been established in animal models, little is known about their efficacy in the current outbreak setting. The continuing opportunity for transmission raises concerns about the potential for evolution of the virus and for expansion beyond the current risk groups. The priorities for action are clear: 1) more data on the efficacy of vaccines and drugs in infected humans must be gathered; 2) global collaborations are necessary to ensure that government authorities work with the private sector in developed and low and middle income countries (LMICs) to provide the availability of treatments and vaccines, especially in historically endemic/enzootic areas; 3) diagnostic and surveillance capacity must be increased to identify areas and populations where the virus is present and may seed resurgence; 4) those at high risk of severe outcomes (e.g., immunocompromised, untreated HIV, pregnant women, and inflammatory skin conditions) must be informed of the risk of infection and be protected from community transmission of MPXV; 5) engagement with the hardest hit communities in a non-stigmatizing way is needed to increase the understanding and acceptance of public health measures; and 6) repositories of monkeypox clinical samples, including blood, fluids, tissues and lesion material must be established for researchers. This MPXV outbreak is a warning that pandemic preparedness plans need additional coordination and resources. We must prepare for continuing transmission, resurgence, and repeated spillovers of MPXV.
|
17
|
Robles EE, Jin Y, Smyth P, Scheuermann RH, Bui JD, Wang HY, Oak J, Qian Y. A cell-level discriminative neural network model for diagnosis of blood cancers. medRxiv 2023:2023.02.07.23285606. [PMID: 36798344 PMCID: PMC9934808 DOI: 10.1101/2023.02.07.23285606] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/12/2023]
Abstract
Motivation: Precise identification of cancer cells in patient samples is essential for accurate diagnosis and clinical monitoring but has been a significant challenge in machine learning approaches for cancer precision medicine. In most scenarios, training data are only available with disease annotation at the subject or sample level. Traditional approaches separate the classification process into multiple steps that are optimized independently. Recent methods either focus on predicting sample-level diagnosis without identifying individual pathologic cells or are less effective for identifying heterogeneous cancer cell phenotypes. Results: We developed a generalized end-to-end differentiable model, the Cell Scoring Neural Network (CSNN), which takes the available sample-level training data and predicts both the diagnosis of the testing samples and the identity of the diagnostic cells in the sample, simultaneously. The cell-level density differences between samples are linked to the sample diagnosis, which allows the probabilities of individual cells being diagnostic to be calculated using backpropagation. We applied CSNN to two independent clinical flow cytometry datasets for leukemia diagnosis. In both qualitative and quantitative assessments, CSNN outperformed preexisting neural network modeling approaches for both cancer diagnosis and cell-level classification. Post hoc decision trees and 2D dot plots were generated for interpretation of the identified cancer cells, showing that the identified cell phenotypes match the cancer endotypes observed clinically in patient cohorts. Independent data clustering analysis confirmed the identified cancer cell populations. Availability: The source code of CSNN and datasets used in the experiments are publicly available on GitHub and FlowRepository. Contact: Edgar E. Robles (roblesee@uci.edu) and Yu Qian (mqian@jcvi.org). Supplementary information: Supplementary data are available on GitHub and at Bioinformatics online.
|
18
|
Tarke A, Zhang Y, Methot N, Narowski TM, Phillips E, Mallal S, Frazier A, Filaci G, Weiskopf D, Dan JM, Premkumar L, Scheuermann RH, Sette A, Grifoni A. Targets and cross-reactivity of human T cell recognition of Common Cold Coronaviruses. bioRxiv 2023:2023.01.04.522794. [PMID: 36656777 PMCID: PMC9844015 DOI: 10.1101/2023.01.04.522794] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/07/2023]
Abstract
The Coronavirus (CoV) family includes a variety of viruses able to infect humans. Endemic CoVs that can cause the common cold belong to the alphaCoV and betaCoV genera, with the betaCoV genus also containing subgenera of zoonotic and pandemic concern, including sarbecoCoV (SARS-CoV and SARS-CoV-2) and merbecoCoV (MERS-CoV). It is therefore warranted to explore pan-CoV vaccine concepts, to provide adaptive immune protection against new potential CoV outbreaks, particularly in the context of betaCoV sublineages. To explore the feasibility of eliciting CD4+ T cell responses widely cross-recognizing different CoVs, we utilized samples collected pre-pandemic to systematically analyze T cell reactivity against representative alpha (NL63) and beta (OC43) common cold CoVs (CCC). Similar to previous findings on SARS-CoV-2, the S, N, M, and nsp3 antigens were immunodominant for both viruses while nsp2 and nsp12 were immunodominant for NL63 and OC43, respectively. We next performed a comprehensive T cell epitope screen, identifying 78 OC43 and 87 NL63-specific epitopes. For a selected subset of 18 epitopes, we experimentally assessed the T cell capability to cross-recognize sequences from representative viruses belonging to alphaCoV, sarbecoCoV, and beta-non-sarbecoCoV groups. We found general conservation within the alpha and beta groups, with cross-reactivity experimentally detected in 89% of the instances associated with sequence conservation of >67%. However, despite sequence conservation, limited cross-reactivity was observed in the case of sarbecoCoV (50% of instances), indicating that previous CoV exposure to viruses phylogenetically closer to this subgenus is a contributing factor in determining cross-reactivity. Overall, these results provide critical insights for the development of future pan-CoV vaccines.
|
19
|
Grifoni A, Zhang Y, Tarke A, Sidney J, Rubiro P, Reina-Campos M, Filaci G, Dan JM, Scheuermann RH, Sette A. Defining antigen targets to dissect vaccinia virus and monkeypox virus-specific T cell responses in humans. Cell Host Microbe 2022; 30:1662-1670.e4. [PMID: 36463861 PMCID: PMC9718645 DOI: 10.1016/j.chom.2022.11.003] [Citation(s) in RCA: 24] [Impact Index Per Article: 12.0] [Received: 09/04/2022] [Revised: 10/17/2022] [Accepted: 11/07/2022] [Indexed: 12/04/2022]
Abstract
The monkeypox virus (MPXV) outbreak confirmed in May 2022 in non-endemic countries is raising concern about the pandemic potential of novel orthopoxviruses. Little is known regarding MPXV immunity in the context of MPXV infection or vaccination with vaccinia-based vaccines (VACV). As with vaccinia, T cells are likely to provide an important contribution to overall immunity to MPXV. Here, we leveraged the epitope information available in the Immune Epitope Database (IEDB) on VACV to predict potential MPXV targets recognized by CD4+ and CD8+ T cell responses. We found a high degree of conservation between VACV epitopes and MPXV and defined T cell immunodominant targets. These analyses enabled the design of peptide pools able to experimentally detect VACV-specific T cell responses and MPXV cross-reactive T cells in a cohort of vaccinated individuals. Our findings will facilitate the monitoring of cellular immunity following MPXV infection and vaccination.
|
20
|
Wallace ZS, Davis J, Niewiadomska AM, Olson RD, Shukla M, Stevens R, Zhang Y, Zmasek CM, Scheuermann RH. Early detection of emerging SARS-CoV-2 variants of interest for experimental evaluation. Frontiers in Bioinformatics 2022; 2:1020189. [PMCID: PMC9638046 DOI: 10.3389/fbinf.2022.1020189] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 08/15/2022] [Accepted: 10/10/2022] [Indexed: 11/10/2022]
Abstract
Since the beginning of the COVID-19 pandemic, SARS-CoV-2 has demonstrated its ability to rapidly and continuously evolve, leading to the emergence of thousands of different sequence variants, many with distinctive phenotypic properties. Fortunately, the broad application of next generation sequencing (NGS) across the globe has produced a wealth of SARS-CoV-2 genome sequences, offering a comprehensive picture of how this virus is evolving so that accurate diagnostics, reliable therapeutics, and prophylactic vaccines against COVID-19 can be developed and maintained. The millions of SARS-CoV-2 sequences deposited into genomic sequencing databases, including GenBank, BV-BRC, and GISAID, are annotated with the dates and geographic locations of sample collection, and can be aligned to and compared with the Wuhan-Hu-1 reference genome to extract their constellation of nucleotide and amino acid substitutions. By aggregating these data into concise datasets, the spread of variants through space and time can be assessed. Variant tracking efforts have initially focused on the Spike protein due to its critical role in viral tropism and antibody neutralization. To identify emerging variants of concern as early as possible, we developed a computational pipeline to process the genomic data and assign risk scores based on both epidemiological and functional parameters. Epidemiological dynamics are used to identify variants exhibiting substantial growth over time and spread across geographical regions. Experimental data that quantify Spike protein regions targeted by adaptive immunity and critical for other virus characteristics are used to predict variants with consequential immunogenic and pathogenic impacts. The growth assessment and functional impact scores are combined to produce a Composite Score for any set of Spike substitutions detected. With this systematic method to routinely score and rank emerging variants, we have established an approach to identify threatening variants early and prioritize them for experimental evaluation.
|
21
|
Palatnik-de-Sousa I, Wallace ZS, Cavalcante SC, Ribeiro MPF, Silva JABM, Cavalcante RC, Scheuermann RH, Palatnik-de-Sousa CB. A novel vaccine based on SARS-CoV-2 CD4+ and CD8+ T cell conserved epitopes from variants Alpha to Omicron. Sci Rep 2022; 12:16731. [PMID: 36202985 PMCID: PMC9537284 DOI: 10.1038/s41598-022-21207-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/08/2022] [Accepted: 09/23/2022] [Indexed: 12/03/2022]
Abstract
As of September 1, 2022, COVID-19 had caused 599,825,400 confirmed cases, including 6,469,458 deaths. Currently used vaccines reduced severity and mortality but not virus transmission or reinfection by different strains. They are based on the Spike protein of the Wuhan reference virus, which, although highly antigenic, has accumulated many mutations in SARS-CoV-2 variants that escape vaccine-generated immune responses. Multiepitope vaccines based on 100% conserved epitopes of multiple proteins of all SARS-CoV-2 variants, rather than a single highly mutating antigen, could offer more long-lasting protection. In this study, a multiepitope multivariant vaccine was designed using immunoinformatics and in silico approaches. It is composed of highly promiscuous and strong HLA binding CD4+ and CD8+ T cell epitopes of the S, M, N, E, ORF1ab, ORF6 and ORF8 proteins. Based on the analysis of one genome per WHO clade, the epitopes were 100% conserved among the Wuhan-Hu-1, Alpha, Beta, Gamma, Delta, Omicron, Mu, Zeta, Lambda and R1 variants. An extended epitope-conservancy analysis performed using GISAID metadata of 3,630,666 SARS-CoV-2 genomes of these variants and the additional genomes of the Epsilon, Iota, Theta, Eta, Kappa and GH490 R clades confirmed the high conservancy of the epitopes. All but one of the CD4 peptides showed a level of conservation greater than 97% among all genomes. All but one of the CD8 epitopes showed a level of conservation greater than 96% among all genomes, with the vast majority greater than 99%. A multiepitope and multivariant recombinant vaccine was designed and it was stable, mildly hydrophobic and non-toxic. The vaccine has good molecular docking with TLR4 and promoted, without adjuvant, strong B and Th1 memory immune responses and secretion of high levels of IL-2, IFN-γ, lower levels of IL-12, TGF-β and IL-10, and no IL-6. Experimental in vivo studies should validate the vaccine’s further use as a preventive tool with cross-protective properties.
|
22
|
Le H, Peng B, Uy J, Carrillo D, Zhang Y, Aevermann BD, Scheuermann RH. Machine learning for cell type classification from single nucleus RNA sequencing data. PLoS One 2022; 17:e0275070. [PMID: 36149937 PMCID: PMC9506651 DOI: 10.1371/journal.pone.0275070] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 02/03/2022] [Accepted: 09/09/2022] [Indexed: 11/18/2022]
Abstract
With the advent of single cell/nucleus RNA sequencing (sc/snRNA-seq), the field of cell phenotyping is now a data-driven exercise providing statistical evidence to support cell type/state categorization. However, the task of classifying cells into specific, well-defined categories with the empirical data provided by sc/snRNA-seq remains nontrivial due to the difficulty in determining specific differences between related cell types with close transcriptional similarities, resulting in challenges with matching cell types identified in separate experiments. To investigate possible approaches to overcome these obstacles, we explored the use of supervised machine learning methods, namely logistic regression, support vector machines, random forests, neural networks, and light gradient boosting machine (LightGBM), as approaches to classify cell types using snRNA-seq datasets from human brain middle temporal gyrus (MTG) and human kidney. Classification accuracy was evaluated using an F-beta score weighted in favor of precision to account for technical artifacts of gene expression dropout. We examined the impact of hyperparameter optimization and feature selection methods on F-beta score performance. We found that the best performing model for granular cell type classification in both datasets is a multinomial logistic regression classifier and that an effective feature selection step was the most influential factor in optimizing the performance of the machine learning pipelines.
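As a rough illustration of the precision-weighted F-beta score mentioned in this abstract, the sketch below computes F-beta from confusion counts. The function name `f_beta` and the default `beta=0.5` are assumptions made for illustration only; the abstract does not state the beta value used. Any beta below 1 weights precision more heavily than recall.

```python
def f_beta(tp: int, fp: int, fn: int, beta: float = 0.5) -> float:
    """Return the F-beta score from true-positive, false-positive and
    false-negative counts. beta < 1 favors precision; beta = 1 gives F1.
    The default beta=0.5 is an illustrative assumption."""
    if tp == 0:
        # No true positives: precision and recall are zero (or undefined),
        # so the score is reported as 0.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    b2 = beta ** 2
    return (1 + b2) * precision * recall / (b2 * precision + recall)


if __name__ == "__main__":
    # Hypothetical counts: high precision (0.8), lower recall (0.5),
    # as might occur when dropout hides expression in some cells.
    print(f_beta(8, 2, 8, beta=0.5))
    print(f_beta(8, 2, 8, beta=1.0))
```

With high precision and lower recall, the precision-weighted score is higher than F1, which is the behavior a precision-favoring evaluation is meant to reward.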
|
23
|
Zhang Y, Sun H, Mandava A, Aevermann BD, Kollmann TR, Scheuermann RH, Qiu X, Qian Y. FastMix: a versatile data integration pipeline for cell type-specific biomarker inference. Bioinformatics 2022; 38:4735-4744. [PMID: 36018232 PMCID: PMC9801972 DOI: 10.1093/bioinformatics/btac585] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/26/2022] [Revised: 08/18/2022] [Accepted: 08/25/2022] [Indexed: 01/07/2023]
Abstract
Motivation: Flow cytometry (FCM) and transcription profiling are two widely used assays in translational immunology research. However, there is no data integration pipeline for analyzing these two types of assays together with experiment variables for biomarker inference. Current FCM data analysis mainly relies on subjective manual gating analysis, which is difficult to integrate directly with other automated computational methods. Existing deconvolutional analysis of bulk transcriptomics relies on predefined marker genes in the transcriptomics data, which are unavailable for novel cell types, and does not utilize the FCM data that provide canonical phenotypic definitions of the cell types. Results: We developed a novel analytics pipeline, FastMix, for computational immunology, which integrates flow cytometry, bulk transcriptomics and clinical covariates for identifying cell type-specific gene expression signatures and biomarker genes. FastMix addresses the 'large p, small n' problem in the gene expression and flow cytometry integration analysis via a linear mixed effects model (LMER) for both cross-sectional and longitudinal studies. Its novel moment-based estimator not only reduces bias in parameter estimation but also is more efficient than iterative optimization. The FastMix pipeline also includes a cutting-edge flow cytometry data analysis method, DAFi, for identifying cell populations of interest and their characteristics. Simulation studies showed that FastMix produced smaller type I/II errors than competing methods. Validation using real data from two vaccine studies showed that FastMix identified a set of signature genes consistent with those from independent single-cell RNA-seq analysis, and produced additional interesting findings. Availability and implementation: Source code of FastMix is publicly available at https://github.com/terrysun0302/FastMix. Supplementary information: Supplementary data are available at Bioinformatics online.
|
24
|
DeGrace MM, Ghedin E, Frieman MB, Krammer F, Grifoni A, Alisoltani A, Alter G, Amara RR, Baric RS, Barouch DH, Bloom JD, Bloyet LM, Bonenfant G, Boon ACM, Boritz EA, Bratt DL, Bricker TL, Brown L, Buchser WJ, Carreño JM, Cohen-Lavi L, Darling TL, Davis-Gardner ME, Dearlove BL, Di H, Dittmann M, Doria-Rose NA, Douek DC, Drosten C, Edara VV, Ellebedy A, Fabrizio TP, Ferrari G, Fischer WM, Florence WC, Fouchier RAM, Franks J, García-Sastre A, Godzik A, Gonzalez-Reiche AS, Gordon A, Haagmans BL, Halfmann PJ, Ho DD, Holbrook MR, Huang Y, James SL, Jaroszewski L, Jeevan T, Johnson RM, Jones TC, Joshi A, Kawaoka Y, Kercher L, Koopmans MPG, Korber B, Koren E, Koup RA, LeGresley EB, Lemieux JE, Liebeskind MJ, Liu Z, Livingston B, Logue JP, Luo Y, McDermott AB, McElrath MJ, Meliopoulos VA, Menachery VD, Montefiori DC, Mühlemann B, Munster VJ, Munt JE, Nair MS, Netzl A, Niewiadomska AM, O'Dell S, Pekosz A, Perlman S, Pontelli MC, Rockx B, Rolland M, Rothlauf PW, Sacharen S, Scheuermann RH, Schmidt SD, Schotsaert M, Schultz-Cherry S, Seder RA, Sedova M, Sette A, Shabman RS, Shen X, Shi PY, Shukla M, Simon V, Stumpf S, Sullivan NJ, Thackray LB, Theiler J, Thomas PG, Trifkovic S, Türeli S, Turner SA, Vakaki MA, van Bakel H, VanBlargan LA, Vincent LR, Wallace ZS, Wang L, Wang M, Wang P, Wang W, Weaver SC, Webby RJ, Weiss CD, Wentworth DE, Weston SM, Whelan SPJ, Whitener BM, Wilks SH, Xie X, Ying B, Yoon H, Zhou B, Hertz T, Smith DJ, Diamond MS, Post DJ, Suthar MS. Defining the risk of SARS-CoV-2 variants on immune protection. Nature 2022; 605:640-652. [PMID: 35361968 PMCID: PMC9345323 DOI: 10.1038/s41586-022-04690-5] [Citation(s) in RCA: 99] [Impact Index Per Article: 49.5] [Received: 11/25/2021] [Accepted: 03/24/2022] [Indexed: 11/09/2022]
Abstract
The global emergence of many severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) variants jeopardizes the protective antiviral immunity induced after infection or vaccination. To address the public health threat caused by the increasing SARS-CoV-2 genomic diversity, the National Institute of Allergy and Infectious Diseases within the National Institutes of Health established the SARS-CoV-2 Assessment of Viral Evolution (SAVE) programme. This effort was designed to provide a real-time risk assessment of SARS-CoV-2 variants that could potentially affect the transmission, virulence, and resistance to infection- and vaccine-induced immunity. The SAVE programme is a critical data-generating component of the US Government SARS-CoV-2 Interagency Group to assess implications of SARS-CoV-2 variants on diagnostics, vaccines and therapeutics, and for communicating public health risk. Here we describe the coordinated approach used to identify and curate data about emerging variants, their impact on immunity and effects on vaccine protection using animal models. We report the development of reagents, methodologies, models and notable findings facilitated by this collaborative approach and identify future challenges. This programme is a template for the response to rapidly evolving pathogens with pandemic potential by monitoring viral evolution in the human population to identify variants that could reduce the effectiveness of countermeasures.
|
25
|
Zmasek CM, Lefkowitz EJ, Niewiadomska A, Scheuermann RH. Genomic evolution of the Coronaviridae family. Virology 2022; 570:123-133. [PMID: 35398776 PMCID: PMC8965632 DOI: 10.1016/j.virol.2022.03.005] [Citation(s) in RCA: 10] [Impact Index Per Article: 5.0] [Received: 11/12/2021] [Revised: 03/11/2022] [Accepted: 03/18/2022] [Indexed: 01/03/2023]
Abstract
The current outbreak of coronavirus disease-2019 (COVID-19) caused by SARS-CoV-2 poses unparalleled challenges to global public health. SARS-CoV-2 is a Betacoronavirus, one of four genera belonging to the Coronaviridae subfamily Orthocoronavirinae. Coronaviridae, in turn, are members of the order Nidovirales, a group of enveloped, positive-stranded RNA viruses. Here we present a systematic phylogenetic and evolutionary study based on protein domain architecture, encompassing the entire proteomes of all Orthocoronavirinae, as well as other Nidovirales. This analysis has revealed that the genomic evolution of Nidovirales is associated with extensive gains and losses of protein domains. In Orthocoronavirinae, the sections of the genomes that show the largest divergence in protein domains are found in the proteins encoded in the amino-terminal end of the polyprotein (PP1ab), the spike protein (S), and many of the accessory proteins. The diversity among the accessory proteins is particularly striking, as each subgenus possesses a set of accessory proteins that is almost entirely specific to that subgenus. The only notable exception to this is ORF3b, which is present and orthologous over all Alphacoronaviruses. In contrast, the membrane protein (M), envelope small membrane protein (E), nucleoprotein (N), as well as proteins encoded in the central and carboxy-terminal end of PP1ab (such as the 3C-like protease, RNA-dependent RNA polymerase, and Helicase) show stable domain architectures across all Orthocoronavirinae. This comprehensive analysis of the Coronaviridae domain architecture has important implications for efforts to develop broadly cross-protective coronavirus vaccines.
|