1
Duan C, Rong S, Buerer L, Neil CR, Savatt JM, Strande NT, Fairbrother WG. One-Size-Fits-Many: Antisense oligonucleotides for rescuing splicing mutations in hotspot exons. bioRxiv 2024:2024.12.07.627366. [PMID: 39677675 PMCID: PMC11643266 DOI: 10.1101/2024.12.07.627366] [Indexed: 12/17/2024]
Abstract
Mutations that impact splicing play a significant role in disease etiology but are not fully understood. To characterize the impact of exonic variants on splicing in 71 clinically-actionable disease genes in asymptomatic people, we analyzed 32,112 exonic mutations from ClinVar and Geisinger MyCode using a minigene reporter assay. We identify 1,733 splice-disrupting mutations, of which the most extreme 1-2% of variants are likely to be deleterious. We report that these variants are not distributed evenly across exons but are mostly concentrated in the ∼8% of exons that are most susceptible to splicing mutations (i.e. hotspot exons). We demonstrate that splicing defects in these exons can be reverted by ASOs targeting the splice sites of either their upstream or downstream flanking exons. This finding supports the feasibility of developing single therapeutic ASOs that could revert all splice-altering variants localized to a particular exon.
2
La Fleur A, Shi Y, Seelig G. Decoding biology with massively parallel reporter assays and machine learning. Genes Dev 2024; 38:843-865. [PMID: 39362779 PMCID: PMC11535156 DOI: 10.1101/gad.351800.124] [Indexed: 10/05/2024]
Abstract
Massively parallel reporter assays (MPRAs) are powerful tools for quantifying the impacts of sequence variation on gene expression. Reading out molecular phenotypes with sequencing enables interrogating the impact of sequence variation beyond genome scale. Machine learning models integrate and codify information learned from MPRAs and enable generalization by predicting sequences outside the training data set. Models can provide a quantitative understanding of cis-regulatory codes controlling gene expression, enable variant stratification, and guide the design of synthetic regulatory elements for applications from synthetic biology to mRNA and gene therapy. This review focuses on cis-regulatory MPRAs, particularly those that interrogate cotranscriptional and post-transcriptional processes: alternative splicing, cleavage and polyadenylation, translation, and mRNA decay.
Affiliation(s)
- Alyssa La Fleur
  - Paul G. Allen School of Computer Science and Engineering, University of Washington, Seattle, Washington 98195, USA
- Yongsheng Shi
  - Department of Microbiology and Molecular Genetics, School of Medicine, University of California, Irvine, Irvine, California 92697, USA
- Georg Seelig
  - Paul G. Allen School of Computer Science and Engineering, University of Washington, Seattle, Washington 98195, USA
  - Department of Electrical & Computer Engineering, University of Washington, Seattle, Washington 98195, USA
3
Sullivan PJ, Quinn JMW, Wu W, Pinese M, Cowley MJ. SpliceVarDB: A comprehensive database of experimentally validated human splicing variants. Am J Hum Genet 2024; 111:2164-2175. [PMID: 39226898 PMCID: PMC11480807 DOI: 10.1016/j.ajhg.2024.08.002] [Received: 10/09/2023] [Revised: 08/03/2024] [Accepted: 08/06/2024] [Indexed: 09/05/2024] Open
Abstract
Variants that alter gene splicing are estimated to comprise up to a third of all disease-causing variants, yet they are hard to predict from DNA sequencing data alone. To overcome this, many groups are incorporating RNA-based analyses, which are resource intensive, particularly for diagnostic laboratories. There are thousands of functionally validated variants that induce mis-splicing; however, this information is not consolidated, and they are under-represented in ClinVar, which presents a barrier to variant interpretation and can result in duplication of validation efforts. To address this issue, we developed SpliceVarDB, an online database consolidating over 50,000 variants assayed for their effects on splicing in over 8,000 human genes. We evaluated over 500 published data sources and established a spliceogenicity scale to standardize, harmonize, and consolidate variant validation data generated by a range of experimental protocols. According to the strength of their supporting evidence, variants were classified as "splice-altering" (∼25%), "not splice-altering" (∼25%), and "low-frequency splice-altering" (∼50%), which correspond to weak or indeterminate evidence of spliceogenicity. Importantly, 55% of the splice-altering variants in SpliceVarDB are outside the canonical splice sites (5.6% are deep intronic). These variants can support the variant curation diagnostic pathway and can be used to provide the high-quality data necessary to develop more accurate in silico splicing predictors. The variants are accessible through an online platform, SpliceVarDB, with additional features for visualization, variant information, in silico predictions, and validation metrics. SpliceVarDB is a very large collection of splice-altering variants and is available at https://splicevardb.org.
Affiliation(s)
- Patricia J Sullivan
  - Children's Cancer Institute, Lowy Cancer Research Centre, UNSW Sydney, Sydney, NSW, Australia; School of Clinical Medicine, UNSW Medicine & Health, UNSW Sydney, Sydney, NSW, Australia; UNSW Centre for Childhood Cancer Research, UNSW Sydney, Sydney, NSW, Australia
- Julian M W Quinn
  - Children's Cancer Institute, Lowy Cancer Research Centre, UNSW Sydney, Sydney, NSW, Australia
- Weilin Wu
  - Children's Cancer Institute, Lowy Cancer Research Centre, UNSW Sydney, Sydney, NSW, Australia
- Mark Pinese
  - Children's Cancer Institute, Lowy Cancer Research Centre, UNSW Sydney, Sydney, NSW, Australia; School of Clinical Medicine, UNSW Medicine & Health, UNSW Sydney, Sydney, NSW, Australia
- Mark J Cowley
  - Children's Cancer Institute, Lowy Cancer Research Centre, UNSW Sydney, Sydney, NSW, Australia
4
Kobren SN, Moldovan MA, Reimers R, Traviglia D, Li X, Barnum D, Veit A, Corona RI, Carvalho Neto GDV, Willett J, Berselli M, Ronchetti W, Nelson SF, Martinez-Agosto JA, Sherwood R, Krier J, Kohane IS, Sunyaev SR. Joint, multifaceted genomic analysis enables diagnosis of diverse, ultra-rare monogenic presentations. bioRxiv 2024:2024.02.13.580158. [PMID: 38405764 PMCID: PMC10888768 DOI: 10.1101/2024.02.13.580158] [Indexed: 02/27/2024]
Abstract
Genomics for rare disease diagnosis has advanced at a rapid pace due to our ability to perform "N-of-1" analyses on individual patients with ultra-rare diseases. The increasing sizes of ultra-rare disease cohorts internationally now enable cohort-wide analyses for new discoveries, but well-calibrated statistical genetics approaches for jointly analyzing these patients are still under development [1,2]. The Undiagnosed Diseases Network (UDN) brings multiple clinical, research and experimental centers under the same umbrella across the United States to facilitate and scale N-of-1 analyses. Here, we present the first joint analysis of whole genome sequencing data of UDN patients across the network. We introduce new, well-calibrated statistical methods for prioritizing disease genes with de novo recurrence and compound heterozygosity. We also detect pathways enriched with candidate and known diagnostic genes. Our computational analysis, coupled with a systematic clinical review, recapitulated known diagnoses and revealed new disease associations. We further release a software package, RaMeDiES, enabling automated cross-analysis of deidentified sequenced cohorts for new diagnostic and research discoveries. Gene-level findings and variant-level information across the cohort are available in a public-facing browser (https://dbmi-bgm.github.io/udn-browser/). These results show that N-of-1 efforts should be supplemented by a joint genomic analysis across cohorts.
Affiliation(s)
- Daniel Traviglia
  - Department of Biomedical Informatics, Harvard Medical School, Boston, MA
- Xinyun Li
  - Program in Computational Biology and Bioinformatics, Yale University, New Haven, CT
- Alexander Veit
  - Department of Biomedical Informatics, Harvard Medical School, Boston, MA
- Rosario I. Corona
  - Department of Human Genetics, David Geffen School of Medicine, University of California, Los Angeles, CA
- George de V. Carvalho Neto
  - Department of Human Genetics, David Geffen School of Medicine, University of California, Los Angeles, CA
- Julian Willett
  - Department of Pathology and Laboratory Medicine, NewYork-Presbyterian Weill Cornell Medical Center, New York, NY
- Michele Berselli
  - Department of Biomedical Informatics, Harvard Medical School, Boston, MA
- William Ronchetti
  - Department of Biomedical Informatics, Harvard Medical School, Boston, MA
- Stanley F. Nelson
  - Department of Human Genetics, David Geffen School of Medicine, University of California, Los Angeles, CA
- Julian A. Martinez-Agosto
  - Department of Human Genetics, David Geffen School of Medicine, University of California, Los Angeles, CA
- Richard Sherwood
  - Division of Genetics, Department of Medicine, Brigham and Women’s Hospital and Harvard Medical School, Boston, MA
- Joel Krier
  - Department of Genetics, Atrius Health, Boston, MA
- Isaac S. Kohane
  - Department of Biomedical Informatics, Harvard Medical School, Boston, MA
- Shamil R. Sunyaev
  - Department of Biomedical Informatics, Harvard Medical School, Boston, MA
5
Schoch K, Ruegg MSG, Fellows BJ, Cao J, Uhrig S, Einsele-Scholz S, Biskup S, Hawarden SRA, Salpietro V, Capra V, Brown CM, Accogli A, Shashi V, Bicknell LS. A second hotspot for pathogenic exon-skipping variants in CDC45. Eur J Hum Genet 2024; 32:786-794. [PMID: 38467731 PMCID: PMC11219862 DOI: 10.1038/s41431-024-01583-1] [Received: 12/16/2023] [Revised: 02/13/2024] [Accepted: 02/26/2024] [Indexed: 03/13/2024] Open
Abstract
Biallelic pathogenic variants in CDC45 are associated with Meier-Gorlin syndrome with craniosynostosis (MGORS type 7), which also includes short stature and absent/hypoplastic patellae. Identified variants act through a hypomorphic loss-of-function mechanism to reduce CDC45 activity and impact DNA replication initiation. In addition to missense and premature termination variants, several pathogenic synonymous variants have been identified, most of which cause increased skipping of exon 4, which encodes an essential part of the RecJ-orthologue's DHH domain. Here we have identified a second cohort of families segregating CDC45 variants, where patients have craniosynostosis and a reduction in height, alongside common facial dysmorphisms, including thin eyebrows, consistent with MGORS7. Skipping of exon 15 is a consequence of two different variants, including a shared synonymous variant that is enriched in individuals of East Asian ancestry, while other variants in trans are predicted to alter key intramolecular interactions in α/β domain II, or cause retention of an intron within the 3'UTR. Our cohort and functional data confirm that exon skipping is a relatively common pathogenic mechanism in CDC45 and highlight the need for alternative splicing events, such as exon skipping, to be especially considered for variants initially predicted to be less likely to cause the phenotype, particularly synonymous variants.
Affiliation(s)
- Kelly Schoch
  - Division of Medical Genetics, Department of Pediatrics, Duke University School of Medicine, Durham, NC, USA
- Mischa S G Ruegg
  - Department of Biochemistry, University of Otago, Dunedin, New Zealand
- Bridget J Fellows
  - Department of Biochemistry, University of Otago, Dunedin, New Zealand
- Joseph Cao
  - Division of Pediatric Radiology, Department of Radiology, Duke University School of Medicine, Durham, NC, USA
- Sabine Uhrig
  - Institute of Clinical Genetics, Klinikum Stuttgart, Stuttgart, Germany
- Saskia Biskup
  - Center for Human Genetics Tuebingen and CeGaT GmbH, Tuebingen, Germany
- Samuel R A Hawarden
  - Department of Pathology, Dunedin School of Medicine, University of Otago, Dunedin, New Zealand
- Vincenzo Salpietro
  - Department of Biotechnological and Applied Clinical Sciences, University of L'Aquila, L'Aquila, Italy
- Valeria Capra
  - Genomics and Clinical Genetics, IRCCS Istituto Giannina Gaslini, Genoa, Italy
- Chris M Brown
  - Department of Biochemistry, University of Otago, Dunedin, New Zealand
- Andrea Accogli
  - Department of Specialized Medicine, Division of Medical Genetics, McGill University Health Centre, Montreal, QC, Canada
  - Department of Human Genetics, Faculty of Medicine, McGill University, Montreal, QC, Canada
- Vandana Shashi
  - Division of Medical Genetics, Department of Pediatrics, Duke University School of Medicine, Durham, NC, USA
- Louise S Bicknell
  - Department of Biochemistry, University of Otago, Dunedin, New Zealand
6
Uvarova AN, Tkachenko EA, Stasevich EM, Zheremyan EA, Korneev KV, Kuprash DV. Methods for Functional Characterization of Genetic Polymorphisms of Non-Coding Regulatory Regions of the Human Genome. Biochemistry (Mosc) 2024; 89:1002-1013. [PMID: 38981696 DOI: 10.1134/s0006297924060026] [Received: 11/20/2023] [Revised: 03/27/2024] [Accepted: 04/11/2024] [Indexed: 07/11/2024]
Abstract
Numerous associations between genetic polymorphisms and various diseases have been characterized through genome-wide association studies. The majority of clinically significant polymorphisms are localized in non-coding regions of the genome. While modern bioinformatic resources make it possible to predict molecular mechanisms that explain the influence of non-coding polymorphisms on gene expression, such hypotheses require experimental verification. This review discusses methods for elucidating the molecular mechanisms by which specific genetic variants within non-coding sequences contribute to disease pathogenesis. A particular focus is on methods for identifying transcription factors whose binding efficiency depends on polymorphic variations. Despite remarkable progress in bioinformatic resources that predict the impact of polymorphisms on disease pathogenesis, experimental approaches are still needed to investigate this issue.
Affiliation(s)
- Aksinya N Uvarova
  - Center for Precision Genome Editing and Genetic Technologies for Biomedicine, Engelhardt Institute of Molecular Biology, Russian Academy of Sciences, Moscow, 119991, Russia
- Elena A Tkachenko
  - Engelhardt Institute of Molecular Biology, Russian Academy of Sciences, Moscow, 119991, Russia
  - Faculty of Biology, Lomonosov Moscow State University, Moscow, 119234, Russia
- Ekaterina M Stasevich
  - Center for Precision Genome Editing and Genetic Technologies for Biomedicine, Engelhardt Institute of Molecular Biology, Russian Academy of Sciences, Moscow, 119991, Russia
  - Moscow Institute of Physics and Technology, Dolgoprudny, Moscow Region, 141700, Russia
- Elina A Zheremyan
  - Center for Precision Genome Editing and Genetic Technologies for Biomedicine, Engelhardt Institute of Molecular Biology, Russian Academy of Sciences, Moscow, 119991, Russia
- Kirill V Korneev
  - Center for Precision Genome Editing and Genetic Technologies for Biomedicine, Engelhardt Institute of Molecular Biology, Russian Academy of Sciences, Moscow, 119991, Russia
- Dmitry V Kuprash
  - Center for Precision Genome Editing and Genetic Technologies for Biomedicine, Engelhardt Institute of Molecular Biology, Russian Academy of Sciences, Moscow, 119991, Russia
  - Faculty of Biology, Lomonosov Moscow State University, Moscow, 119234, Russia
7
Hirschi OR, Felker SA, Rednam SP, Vallance KL, Parsons DW, Roy A, Cooper GM, Plon SE. Combined bioinformatic and splicing analysis of likely benign intronic and synonymous variants reveals evidence for pathogenicity. Genet Med Open 2024; 2:101850. [PMID: 39669609 PMCID: PMC11613871 DOI: 10.1016/j.gimo.2024.101850] [Received: 11/16/2023] [Revised: 05/07/2024] [Accepted: 05/08/2024] [Indexed: 12/14/2024]
Abstract
Purpose Clinical variant analysis pipelines likely have poor sensitivity to the effects on splicing from variants beyond 10 to 20 bases of exon-intron boundaries. Here, we demonstrate the value of SpliceAI to inform curation of rare variants previously classified as benign/likely benign (B/LB) under current guidelines. Methods Exome sequencing data from 576 pediatric cancer patients enrolled in the Texas KidsCanSeq study were filtered for intronic or synonymous variants absent from population databases, predicted to alter splicing via SpliceAI (>0.20), and scored >10 by combined annotation-dependent depletion. Rare synonymous or intronic B/LB variants in 61 genes submitted to ClinVar were also evaluated and RNA further assessed in monocyte-derived messenger RNA and/or an in vitro splice reporter assay in HEK-293T cells. Results SpliceAI-supplemented analysis of the KidsCanSeq cohort revealed a DICER1 intronic variant that resulted in missplicing in RNA from a proband with a personal and family history of pleuropulmonary blastoma but negative clinical exome and panel reports. Analysis of 34,188 B/LB ClinVar variants yielded 18 variants predicted to cause disrupted reading frames. Assessment of 8 variants (DICER1 n = 4, CDH1 n = 2, PALB2 n = 2) by in vitro splicing assay demonstrated abnormal splice products (mean 66%; range 6% to 100%). When available, phenotypic information from submitting laboratories demonstrated DICER1-associated tumors in 2 families (1 variant) and breast cancer in 3 families (2 PALB2 variants). Conclusion Incorporation of SpliceAI in variant curation pipelines may improve classification of B/LB intronic and synonymous variants and highlight putative pathogenic variants for functional assays and RNA analysis, thereby increasing diagnostic yield for rare diseases.
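The variant filter described in the Methods above can be illustrated with a minimal sketch. The thresholds (SpliceAI > 0.20, CADD > 10) and the consequence classes come from the abstract; the `Variant` record, its field names, and the toy values are hypothetical and are not taken from the study or from any SpliceAI or CADD API.

```python
from dataclasses import dataclass

# Thresholds as stated in the abstract.
SPLICEAI_MIN = 0.20
CADD_MIN = 10.0
# Consequence classes the abstract says were retained.
TARGET_CONSEQUENCES = {"intronic", "synonymous"}


@dataclass(frozen=True)
class Variant:
    """Hypothetical, pre-annotated variant record (illustration only)."""
    name: str
    consequence: str        # e.g. "intronic", "synonymous", "missense"
    in_population_db: bool  # True if seen in a population database
    spliceai: float         # maximum SpliceAI delta score
    cadd: float             # scaled CADD score


def passes_filter(v: Variant) -> bool:
    """Apply all four criteria from the abstract; every one must hold."""
    return (
        v.consequence in TARGET_CONSEQUENCES
        and not v.in_population_db
        and v.spliceai > SPLICEAI_MIN
        and v.cadd > CADD_MIN
    )


def filter_variants(variants: list[Variant]) -> list[Variant]:
    """Return the variants that would be carried forward for RNA analysis."""
    return [v for v in variants if passes_filter(v)]


if __name__ == "__main__":
    # Toy values invented for this example.
    toy = [
        Variant("toy_1", "intronic", False, 0.35, 15.2),
        Variant("toy_2", "synonymous", True, 0.90, 22.0),   # in population DB
        Variant("toy_3", "missense", False, 0.50, 25.0),    # wrong class
        Variant("toy_4", "synonymous", False, 0.20, 18.0),  # not above 0.20
    ]
    print([v.name for v in filter_variants(toy)])  # ['toy_1']
```

Both thresholds are applied as strict inequalities to match the ">" wording of the abstract, so a variant scoring exactly 0.20 is excluded.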
Affiliation(s)
- Owen R. Hirschi
  - Baylor College of Medicine, Houston, TX
  - Texas Children’s Cancer Center, Texas Children’s Hospital, Houston, TX
- Surya P. Rednam
  - Baylor College of Medicine, Houston, TX
  - Texas Children’s Cancer Center, Texas Children’s Hospital, Houston, TX
- D. Williams Parsons
  - Baylor College of Medicine, Houston, TX
  - Texas Children’s Cancer Center, Texas Children’s Hospital, Houston, TX
- Sharon E. Plon
  - Baylor College of Medicine, Houston, TX
  - Texas Children’s Cancer Center, Texas Children’s Hospital, Houston, TX
8
Holm LL, Doktor TK, Flugt KK, Petersen US, Petersen R, Andresen B. All exons are not created equal: exon vulnerability determines the effect of exonic mutations on splicing. Nucleic Acids Res 2024; 52:4588-4603. [PMID: 38324470 PMCID: PMC11077056 DOI: 10.1093/nar/gkae077] [Received: 07/07/2023] [Revised: 01/05/2024] [Accepted: 01/26/2024] [Indexed: 02/09/2024] Open
Abstract
It is now widely accepted that aberrant splicing of constitutive exons is often caused by mutations affecting cis-acting splicing regulatory elements (SREs), but there is a misconception that all exons have an equal dependency on SREs and thus a similar vulnerability to aberrant splicing. We demonstrate that some exons are more likely to be affected by exonic splicing mutations (ESMs) due to an inherent vulnerability, which is context dependent and influenced by the strength of exon definition. We have developed VulExMap, a tool which is based on empirical data that can designate whether a constitutive exon is vulnerable. Using VulExMap, we find that only 25% of all exons can be categorized as vulnerable, whereas two-thirds of 359 previously reported ESMs in 75 disease genes are located in vulnerable exons. Because VulExMap analysis is based on empirical data on splicing of exons in their endogenous context, it includes all features important in determining the vulnerability. We believe that VulExMap will be an important tool when assessing the effect of exonic mutations by pinpointing whether they are located in exons vulnerable to ESMs.
Affiliation(s)
- Lise L Holm
  - Department of Biochemistry and Molecular Biology, University of Southern Denmark, 5230 Odense M, Denmark
  - Villum Center for Bioanalytical Sciences, University of Southern Denmark, 5230 Odense M, Denmark
- Thomas K Doktor
  - Department of Biochemistry and Molecular Biology, University of Southern Denmark, 5230 Odense M, Denmark
  - Villum Center for Bioanalytical Sciences, University of Southern Denmark, 5230 Odense M, Denmark
- Katharina K Flugt
  - Department of Biochemistry and Molecular Biology, University of Southern Denmark, 5230 Odense M, Denmark
  - Villum Center for Bioanalytical Sciences, University of Southern Denmark, 5230 Odense M, Denmark
- Ulrika S S Petersen
  - Department of Biochemistry and Molecular Biology, University of Southern Denmark, 5230 Odense M, Denmark
  - Villum Center for Bioanalytical Sciences, University of Southern Denmark, 5230 Odense M, Denmark
- Rikke Petersen
  - Department of Biochemistry and Molecular Biology, University of Southern Denmark, 5230 Odense M, Denmark
  - Villum Center for Bioanalytical Sciences, University of Southern Denmark, 5230 Odense M, Denmark
- Brage S Andresen
  - Department of Biochemistry and Molecular Biology, University of Southern Denmark, 5230 Odense M, Denmark
  - Villum Center for Bioanalytical Sciences, University of Southern Denmark, 5230 Odense M, Denmark
9
Huckins LM, Brennand K, Bulik CM. Dissecting the biology of feeding and eating disorders. Trends Mol Med 2024; 30:380-391. [PMID: 38431502 DOI: 10.1016/j.molmed.2024.01.009] [Received: 11/15/2023] [Revised: 01/28/2024] [Accepted: 01/31/2024] [Indexed: 03/05/2024]
Abstract
Feeding and eating disorders (FEDs) are heterogenous and characterized by varying patterns of dysregulated eating and weight. Genome-wide association studies (GWASs) are clarifying their underlying biology and their genetic relationship to other psychiatric and metabolic/anthropometric traits. Genetic research on anorexia nervosa (AN) has identified eight significant loci and uncovered genetic correlations implicating both psychiatric and metabolic/anthropometric risk factors. Careful explication of these metabolic contributors may be key to developing effective and enduring treatments for devastating, life-altering, and frequently lethal illnesses. We discuss clinical phenomenology, genomics, phenomics, intestinal microbiota, and functional genomics and propose a path that translates variants to genes, genes to pathways, and pathways to metabolic outcomes to advance the science and eventually treatment of FEDs.
Affiliation(s)
- Laura M Huckins
  - Department of Psychiatry, Division of Molecular Psychiatry, Yale University School of Medicine, New Haven, CT 06511, USA
- Kristen Brennand
  - Department of Psychiatry, Division of Molecular Psychiatry, Yale University School of Medicine, New Haven, CT 06511, USA; Department of Genetics, Wu Tsai Institute, Yale University School of Medicine, New Haven, CT 06511, USA
- Cynthia M Bulik
  - Department of Psychiatry, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA; Department of Medical Epidemiology and Biostatistics, Karolinska Institutet, Stockholm, Sweden; Department of Nutrition, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
10
Sun J, Noss S, Banerjee D, Das M, Girirajan S. Strategies for dissecting the complexity of neurodevelopmental disorders. Trends Genet 2024; 40:187-202. [PMID: 37949722 PMCID: PMC10872993 DOI: 10.1016/j.tig.2023.10.009] [Received: 05/27/2023] [Revised: 09/20/2023] [Accepted: 10/16/2023] [Indexed: 11/12/2023]
Abstract
Neurodevelopmental disorders (NDDs) are associated with a wide range of clinical features, affecting multiple pathways involved in brain development and function. Recent advances in high-throughput sequencing have unveiled numerous genetic variants associated with NDDs, which further contribute to disease complexity and make it challenging to infer disease causation and underlying mechanisms. Herein, we review current strategies for dissecting the complexity of NDDs using model organisms, induced pluripotent stem cells, single-cell sequencing technologies, and massively parallel reporter assays. We further highlight single-cell CRISPR-based screening techniques that allow genomic investigation of cellular transcriptomes with high efficiency, accuracy, and throughput. Overall, we provide an integrated review of experimental approaches that can be applicable for investigating a broad range of complex disorders.
Affiliation(s)
- Jiawan Sun
  - Molecular, Cellular, and Integrative Biosciences Graduate Program, The Huck Institutes of Life Sciences, University Park, PA 16802, USA
- Serena Noss
  - Molecular, Cellular, and Integrative Biosciences Graduate Program, The Huck Institutes of Life Sciences, University Park, PA 16802, USA
- Deepro Banerjee
  - Bioinformatics and Genomics Graduate Program, The Huck Institutes of Life Sciences, University Park, PA 16802, USA
- Maitreya Das
  - Molecular, Cellular, and Integrative Biosciences Graduate Program, The Huck Institutes of Life Sciences, University Park, PA 16802, USA
- Santhosh Girirajan
  - Molecular, Cellular, and Integrative Biosciences Graduate Program, The Huck Institutes of Life Sciences, University Park, PA 16802, USA; Bioinformatics and Genomics Graduate Program, The Huck Institutes of Life Sciences, University Park, PA 16802, USA; Department of Biochemistry and Molecular Biology, Pennsylvania State University, University Park, PA 16802, USA; Department of Anthropology, Pennsylvania State University, University Park, PA 16802, USA
11
Smith C, Kitzman JO. Benchmarking splice variant prediction algorithms using massively parallel splicing assays. Genome Biol 2023; 24:294. [PMID: 38129864 PMCID: PMC10734170 DOI: 10.1186/s13059-023-03144-z] [Received: 05/04/2023] [Accepted: 12/13/2023] [Indexed: 12/23/2023] Open
Abstract
BACKGROUND Variants that disrupt mRNA splicing account for a sizable fraction of the pathogenic burden in many genetic disorders, but identifying splice-disruptive variants (SDVs) beyond the essential splice site dinucleotides remains difficult. Computational predictors are often discordant, compounding the challenge of variant interpretation. Because they are primarily validated using clinical variant sets heavily biased to known canonical splice site mutations, it remains unclear how well their performance generalizes. RESULTS We benchmark eight widely used splicing effect prediction algorithms, leveraging massively parallel splicing assays (MPSAs) as a source of experimentally determined ground truth. MPSAs simultaneously assay many variants to nominate candidate SDVs. We compare experimentally measured splicing outcomes with bioinformatic predictions for 3,616 variants in five genes. Algorithms' concordance with MPSA measurements, and with each other, is lower for exonic than intronic variants, underscoring the difficulty of identifying missense or synonymous SDVs. Deep learning-based predictors trained on gene model annotations achieve the best overall performance at distinguishing disruptive and neutral variants, and controlling for overall call rate genome-wide, SpliceAI and Pangolin have superior sensitivity. Finally, our results highlight two practical considerations when scoring variants genome-wide: finding an optimal score cutoff, and the substantial variability introduced by differences in gene model annotation, and we suggest strategies for optimal splice effect prediction in the face of these issues. CONCLUSION SpliceAI and Pangolin show the best overall performance among the predictors tested; however, improvements in splice effect prediction are still needed, especially within exons.
Affiliation(s)
- Cathy Smith
  - Department of Human Genetics, University of Michigan Medical School, Ann Arbor, MI, 48109, USA
  - Department of Computational Medicine and Bioinformatics, University of Michigan Medical School, Ann Arbor, MI, 48109, USA
- Jacob O Kitzman
  - Department of Human Genetics, University of Michigan Medical School, Ann Arbor, MI, 48109, USA
  - Department of Computational Medicine and Bioinformatics, University of Michigan Medical School, Ann Arbor, MI, 48109, USA
12
Hirschi OR, Felker SA, Rednam SP, Vallance KL, Parsons DW, Roy A, Cooper GM, Plon SE. Combined Bioinformatic and Splicing Analysis of Likely Benign Intronic and Synonymous Variants Reveals Evidence for Pathogenicity. medRxiv 2023:2023.10.30.23297632. [PMID: 37961416 PMCID: PMC10635218 DOI: 10.1101/2023.10.30.23297632] [Indexed: 11/15/2023]
Abstract
Background Current clinical variant analysis pipelines focus on coding variants and intronic variants within 10-20 bases of an exon-intron boundary that may affect splicing. The impact of newer splicing prediction algorithms combined with in vitro splicing assays on rare variants currently considered Benign/Likely Benign (B/LB) is unknown. Methods Exome sequencing data from 576 pediatric cancer patients enrolled in the Texas KidsCanSeq study were filtered for intronic or synonymous variants absent from population databases, predicted to alter splicing via SpliceAI (>0.20), and scored as potentially deleterious by CADD (>10.0). Total cellular RNA was extracted from monocytes and RT-PCR products analyzed. Subsequently, rare synonymous or intronic B/LB variants in a subset of genes submitted to ClinVar were similarly evaluated. Variants predicted to lead to a frameshifted splicing product were functionally assessed using an in vitro splicing reporter assay in HEK-293T cells. Results KidsCanSeq exome data analysis revealed a rare, heterozygous, intronic variant (NM_177438.3(DICER1):c.574-26A>G) predicted by SpliceAI to result in gain of a secondary splice acceptor site. The proband had a personal and family history of pleuropulmonary blastoma consistent with DICER1 syndrome but negative clinical sequencing reports. Proband RNA analysis revealed alternative DICER1 transcripts including the SpliceAI-predicted transcript. Similar bioinformatic analysis of synonymous or intronic B/LB variants (n=31,715) in ClinVar from 61 Mendelian disease genes yielded 18 variants, none of which could be scored by MaxEntScan. Eight of these variants were assessed (DICER1 n=4, CDH1 n=2, PALB2 n=2) using in vitro splice reporter assay and demonstrated abnormal splice products (mean 66%; range 6% to 100%). Available phenotypic information from submitting laboratories demonstrated DICER1 phenotypes in 2 families (1 variant) and breast cancer phenotypes for PALB2 in 3 families (2 variants).
Conclusions Our results demonstrate the power of newer predictive splicing algorithms to highlight rare variants previously considered B/LB in patients with features of hereditary conditions. Incorporation of SpliceAI annotation of existing variant data combined with either direct RNA analysis or in vitro assays has the potential to identify disease-associated variants in patients without a molecular diagnosis.
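The variant filter described in the Methods of this abstract can be illustrated with a minimal sketch. Only the two thresholds (SpliceAI > 0.20, CADD > 10.0) and the consequence classes (intronic or synonymous, absent from population databases) come from the abstract. The `Variant` record, its field names, and the toy variants are hypothetical and are not the authors' pipeline.

```python
from dataclasses import dataclass

# Thresholds quoted in the abstract.
SPLICEAI_MIN = 0.20
CADD_MIN = 10.0


@dataclass
class Variant:
    # Hypothetical record layout, invented for illustration only.
    name: str
    consequence: str        # e.g. "intronic", "synonymous", "missense"
    in_population_db: bool  # True if seen in any population database
    spliceai: float         # SpliceAI score
    cadd: float             # CADD score


def is_candidate(v: Variant) -> bool:
    """Apply the four criteria listed in the abstract's Methods."""
    return (
        v.consequence in {"intronic", "synonymous"}
        and not v.in_population_db
        and v.spliceai > SPLICEAI_MIN
        and v.cadd > CADD_MIN
    )


def filter_candidates(variants):
    """Return the variants that pass every criterion, in input order."""
    return [v for v in variants if is_candidate(v)]


if __name__ == "__main__":
    # Toy variants with made-up scores; not data from the study.
    toy = [
        Variant("var_A", "intronic", False, 0.45, 14.2),
        Variant("var_B", "synonymous", False, 0.10, 18.0),  # SpliceAI too low
        Variant("var_C", "missense", False, 0.80, 25.0),    # wrong consequence
        Variant("var_D", "synonymous", True, 0.60, 12.0),   # in population db
        Variant("var_E", "synonymous", False, 0.31, 10.5),
    ]
    print([v.name for v in filter_candidates(toy)])
```

The comparisons are strict (`>`), matching the ">0.20" and ">10.0" wording; whether the study treated boundary values as passing is not stated in the abstract.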
Affiliation(s)
- Owen R Hirschi
  - Baylor College of Medicine, Houston, Texas
  - Texas Children's Cancer Center, Texas Children's Hospital, Houston, Texas
- Surya P Rednam
  - Baylor College of Medicine, Houston, Texas
  - Texas Children's Cancer Center, Texas Children's Hospital, Houston, Texas
- D Williams Parsons
  - Baylor College of Medicine, Houston, Texas
  - Texas Children's Cancer Center, Texas Children's Hospital, Houston, Texas
- Sharon E Plon
  - Baylor College of Medicine, Houston, Texas
  - Texas Children's Cancer Center, Texas Children's Hospital, Houston, Texas
|
13
|
Vihinen M. Nonsynonymous Synonymous Variants Demand for a Paradigm Shift in Genetics. Curr Genomics 2023; 24:18-23. [PMID: 37920730 PMCID: PMC10334700 DOI: 10.2174/1389202924666230417101020] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 12/09/2022] [Revised: 02/20/2023] [Accepted: 03/01/2023] [Indexed: 11/04/2023] Open
Abstract
Synonymous (also known as silent) variations are by definition not considered to change the coded protein. Still, many variations in this category affect either protein abundance or properties. As this situation is confusing, we have recently introduced systematics for synonymous variations and for those that on the surface look synonymous but may affect the coded protein in various ways. A new category, unsense variation, was introduced to describe variants that do not introduce a stop codon into the variation site, but which lead to different types of changes in the coded protein. Many of these variations lead to mRNA degradation and missing protein. Here, the consequences of the systematics are discussed from the perspectives of variation annotation and interpretation, evolutionary calculations, nonsynonymous-to-synonymous substitution rates, phylogenetics and other evolutionary inferences that are based on the principle of (nearly) neutral synonymous variations. It may be necessary to reassess published results. Further, databases for synonymous variations and prediction methods for such variations should consider unsense variations. Thus, there is a need to evaluate and reflect on the principles of numerous aspects of genetics, ranging from variation naming and classification to evolutionary calculations.
Affiliation(s)
- Mauno Vihinen
  - Department of Experimental Medical Science, Lund University, Lund, BMC B13, Sweden
|
14
|
Mehlferber MM, Kuyumcu-Martinez M, Miller CL, Sheynkman GM. Transcription factors and splice factors - interconnected regulators of stem cell differentiation. Current Stem Cell Reports 2023; 9:31-41. [PMID: 38939410 PMCID: PMC11210451 DOI: 10.1007/s40778-023-00227-2] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Accepted: 06/12/2023] [Indexed: 06/29/2024]
Abstract
Purpose of review: The underlying molecular mechanisms that direct stem cell differentiation into fully functional, mature cells remain an area of ongoing investigation. Cell state is the product of the combinatorial effect of individual factors operating within a coordinated regulatory network. Here, we discuss the contribution of both gene regulatory and splicing regulatory networks in defining stem cell fate during differentiation and the critical role of protein isoforms in this process.
Recent findings: We review recent experimental and computational approaches that characterize gene regulatory networks, splice regulatory networks, and the resulting transcriptome and proteome they mediate during differentiation. Such approaches include long-read RNA sequencing, which has demonstrated high-resolution profiling of mRNA isoforms, and Cas13-based CRISPR, which could make possible high-throughput isoform screening. Collectively, these developments enable systems-level profiling of factors contributing to cell state.
Summary: Overall, gene and splice regulatory networks are important in defining cell state. The emerging high-throughput systems-level approaches will characterize the gene regulatory network components necessary in driving stem cell differentiation.
Affiliation(s)
- Madison M Mehlferber
  - Department of Biochemistry and Molecular Genetics, University of Virginia, Charlottesville, VA 22903
- Muge Kuyumcu-Martinez
  - Department of Molecular Physiology and Biological Physics, University of Virginia, School of Medicine, Fontaine Medical Office Building 1, 415 Ray C. Hunt Dr, Charlottesville, VA 22903
- Clint L Miller
  - Department of Public Health Sciences, Department of Biochemistry and Molecular Genetics, and Department of Biomedical Engineering, University of Virginia, Multistory Building, West Complex, 1335 Lee St, Charlottesville, VA 22908, PO Box 800717, Charlottesville, Virginia 22908
- Gloria M Sheynkman
  - Department of Molecular Physiology and Biological Physics, Center for Public Health Genomics, UVA Comprehensive Cancer Center, Department of Biochemistry and Molecular Genetics, University of Virginia, 1340 Jefferson Park Avenue, Charlottesville, VA 22903
|
15
|
Rong S, Neil CR, Welch A, Duan C, Maguire S, Meremikwu IC, Meyerson M, Evans BJ, Fairbrother WG. Large-scale functional screen identifies genetic variants with splicing effects in modern and archaic humans. Proc Natl Acad Sci U S A 2023; 120:e2218308120. [PMID: 37192163 PMCID: PMC10214146 DOI: 10.1073/pnas.2218308120] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/07/2022] [Accepted: 04/12/2023] [Indexed: 05/18/2023] Open
Abstract
Humans coexisted and interbred with other hominins which later became extinct. These archaic hominins are known to us only through fossil records and for two cases, genome sequences. Here, we engineer Neanderthal and Denisovan sequences into thousands of artificial genes to reconstruct the pre-mRNA processing patterns of these extinct populations. Of the 5,169 alleles tested in this massively parallel splicing reporter assay (MaPSy), we report 962 exonic splicing mutations that correspond to differences in exon recognition between extant and extinct hominins. Using MaPSy splicing variants, predicted splicing variants, and splicing quantitative trait loci, we show that splice-disrupting variants experienced greater purifying selection in anatomically modern humans than that in Neanderthals. Adaptively introgressed variants were enriched for moderate-effect splicing variants, consistent with positive selection for alternative spliced alleles following introgression. As particularly compelling examples, we characterized a unique tissue-specific alternative splicing variant at the adaptively introgressed innate immunity gene TLR1, as well as a unique Neanderthal introgressed alternative splicing variant in the gene HSPG2 that encodes perlecan. We further identified potentially pathogenic splicing variants found only in Neanderthals and Denisovans in genes related to sperm maturation and immunity. Finally, we found splicing variants that may contribute to variation among modern humans in total bilirubin, balding, hemoglobin levels, and lung capacity. Our findings provide unique insights into natural selection acting on splicing in human evolution and demonstrate how functional assays can be used to identify candidate causal variants underlying differences in gene regulation and phenotype.
Affiliation(s)
- Stephen Rong
  - Center for Computational Molecular Biology, Brown University, Providence, RI 02912
  - Department of Molecular Biology, Cell Biology, and Biochemistry, Brown University, Providence, RI 02912
- Christopher R. Neil
  - Department of Molecular Biology, Cell Biology, and Biochemistry, Brown University, Providence, RI 02912
- Anastasia Welch
  - Department of Molecular Biology, Cell Biology, and Biochemistry, Brown University, Providence, RI 02912
- Chaorui Duan
  - Department of Molecular Biology, Cell Biology, and Biochemistry, Brown University, Providence, RI 02912
- Samantha Maguire
  - Department of Molecular Biology, Cell Biology, and Biochemistry, Brown University, Providence, RI 02912
- Ijeoma C. Meremikwu
  - Department of Molecular Biology, Cell Biology, and Biochemistry, Brown University, Providence, RI 02912
- Malcolm Meyerson
  - Department of Molecular Biology, Cell Biology, and Biochemistry, Brown University, Providence, RI 02912
- Ben J. Evans
  - Department of Biology, McMaster University, Hamilton, ON L8S 4K1, Canada
- William G. Fairbrother
  - Center for Computational Molecular Biology, Brown University, Providence, RI 02912
  - Department of Molecular Biology, Cell Biology, and Biochemistry, Brown University, Providence, RI 02912
  - Hassenfeld Child Health Innovation Institute of Brown University, Providence, RI 02912
|
16
|
Smith C, Kitzman JO. Benchmarking splice variant prediction algorithms using massively parallel splicing assays. bioRxiv: the preprint server for biology 2023:2023.05.04.539398. [PMID: 37205456 PMCID: PMC10187268 DOI: 10.1101/2023.05.04.539398] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Indexed: 05/21/2023]
Abstract
Background: Variants that disrupt mRNA splicing account for a sizable fraction of the pathogenic burden in many genetic disorders, but identifying splice-disruptive variants (SDVs) beyond the essential splice site dinucleotides remains difficult. Computational predictors are often discordant, compounding the challenge of variant interpretation. Because they are primarily validated using clinical variant sets heavily biased to known canonical splice site mutations, it remains unclear how well their performance generalizes.
Results: We benchmarked eight widely used splicing effect prediction algorithms, leveraging massively parallel splicing assays (MPSAs) as a source of experimentally determined ground truth. MPSAs simultaneously assay many variants to nominate candidate SDVs. We compared experimentally measured splicing outcomes with bioinformatic predictions for 3,616 variants in five genes. Algorithms' concordance with MPSA measurements, and with each other, was lower for exonic than intronic variants, underscoring the difficulty of identifying missense or synonymous SDVs. Deep learning-based predictors trained on gene model annotations achieved the best overall performance at distinguishing disruptive and neutral variants. Controlling for overall call rate genome-wide, SpliceAI and Pangolin also showed superior overall sensitivity for identifying SDVs. Finally, our results highlight two practical considerations when scoring variants genome-wide: finding an optimal score cutoff, and the substantial variability introduced by differences in gene model annotation. We suggest strategies for optimal splice effect prediction in the face of these issues.
Conclusion: SpliceAI and Pangolin showed the best overall performance among the predictors tested; however, improvements in splice effect prediction are still needed, especially within exons.
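The score-cutoff consideration raised in this abstract can be made concrete with a small sketch that computes sensitivity and specificity of a predictor at a chosen cutoff against assay-derived labels. This is a generic illustration, not the benchmarking procedure used in the paper. The function name, the `>=` calling rule, and the toy scores are assumptions made for the example.

```python
def sensitivity_specificity(scores, is_disruptive, cutoff):
    """Sensitivity and specificity when variants scoring >= cutoff are called disruptive.

    scores        -- predictor scores, one per variant
    is_disruptive -- assay-derived truth labels (True = splice-disruptive)
    cutoff        -- score threshold; the >= rule is an assumption of this sketch
    """
    tp = fn = tn = fp = 0
    for score, truth in zip(scores, is_disruptive):
        called = score >= cutoff
        if truth and called:
            tp += 1
        elif truth:
            fn += 1
        elif called:
            fp += 1
        else:
            tn += 1
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity


if __name__ == "__main__":
    # Invented scores and labels, for illustration only.
    scores = [0.9, 0.6, 0.3, 0.4, 0.05]
    labels = [True, True, True, False, False]
    for cutoff in (0.2, 0.5):
        print(cutoff, sensitivity_specificity(scores, labels, cutoff))
```

Lowering the cutoff in the toy data recovers more true disruptive variants at the cost of more false calls, which is the trade-off a cutoff choice has to balance.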
Affiliation(s)
- Cathy Smith
  - Department of Human Genetics, University of Michigan Medical School, Ann Arbor, MI 48109, USA
  - Department of Computational Medicine and Bioinformatics, University of Michigan Medical School, Ann Arbor, MI 48109, USA
- Jacob O. Kitzman
  - Department of Human Genetics, University of Michigan Medical School, Ann Arbor, MI 48109, USA
  - Department of Computational Medicine and Bioinformatics, University of Michigan Medical School, Ann Arbor, MI 48109, USA
|
17
|
When a Synonymous Variant Is Nonsynonymous. Genes (Basel) 2022; 13:genes13081485. [PMID: 36011397 PMCID: PMC9408308 DOI: 10.3390/genes13081485] [Citation(s) in RCA: 17] [Impact Index Per Article: 5.7] [Received: 08/08/2022] [Revised: 08/17/2022] [Accepted: 08/17/2022] [Indexed: 12/27/2022] Open
Abstract
The term synonymous variation is widely used, but frequently with a wrong or misleading meaning and context. Of the possible nucleotide substitution types in the universal genetic code, 23.8% are for synonymous amino acid changes, but when these variants have a phenotype and functional effect, they are very seldom synonymous. Such variants may manifest changes at DNA, RNA and/or protein levels. Large numbers of variations are erroneously annotated as synonymous, which causes problems, e.g., in clinical genetics and diagnosis of diseases. To facilitate precise communication, novel systematics and nomenclature are introduced for variants that seem synonymous when looking only at the genetic code but which have phenotypes. A new term, unsense variant, is defined as a substitution in the mRNA coding region that affects gene expression and protein production without introducing a stop codon in the variation site. Such variants are common and need to be correctly annotated. Proper naming and annotation are important also to increase awareness of these variants and their consequences.
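The proportion of synonymous substitutions quoted in this abstract is a counting exercise over the standard genetic code, and a short sketch shows how such a figure can be derived. The abstract does not state its counting convention, so the sketch reports the count under two assumed conventions (with and without substitutions that start from stop codons) and may not reproduce the quoted percentage exactly. All names in the code are chosen for this illustration.

```python
from itertools import product

# Standard genetic code in the usual compact form: codons enumerated with
# bases in the order T, C, A, G and the third position varying fastest.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def count_substitutions(include_stop_codons=True):
    """Count single-nucleotide substitutions and how many keep the same meaning.

    Returns (synonymous, total). A substitution is counted as synonymous when
    the original and mutant codons encode the same amino acid (or both are
    stop codons). When include_stop_codons is False, substitutions that start
    from a stop codon are skipped.
    """
    total = synonymous = 0
    for codon, aa in CODON_TABLE.items():
        if aa == "*" and not include_stop_codons:
            continue
        for pos in range(3):
            for base in BASES:
                if base == codon[pos]:
                    continue
                mutant = codon[:pos] + base + codon[pos + 1:]
                total += 1
                if CODON_TABLE[mutant] == aa:
                    synonymous += 1
    return synonymous, total


if __name__ == "__main__":
    for include in (True, False):
        syn, total = count_substitutions(include_stop_codons=include)
        print(f"include_stop_codons={include}: {syn}/{total} = {syn / total:.1%}")
```

Each of the 64 codons has nine possible single-nucleotide substitutions, so the denominator is fixed by the convention chosen; the numerator is driven mostly by third-position degeneracy.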
|