1
Bozsik A, Butz H, Grolmusz VK, Pócza T, Patócs A, Papp J. Spectrum and genotyping strategies of "dark" genetic matter in germline susceptibility genes of tumor syndromes. Crit Rev Oncol Hematol 2025; 205:104549. [PMID: 39528122 DOI: 10.1016/j.critrevonc.2024.104549] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/22/2024] [Revised: 10/23/2024] [Accepted: 10/26/2024] [Indexed: 11/16/2024] Open
Abstract
PURPOSE: Despite the widespread use of high-throughput genotyping strategies, certain mutation types remain understudied. We provide an overview of these often overlooked mutation types, with representative examples from common hereditary cancer syndromes.
METHODS: We conducted a comprehensive review of the literature and locus-specific variant databases to summarize the germline pathogenic variants discovered through non-routine genotyping methods. We evaluated appropriate detection and analysis methods tailored for these specific genetic aberrations. Additionally, we performed in silico splice predictions on deep intronic variants registered in the ClinVar database.
RESULTS: Our study suggests that, aside from founder mutations, most cases are sporadic. However, we anticipate a relatively high likelihood of splice effects for deep intronic variants. The findings underscore the significant clinical utility of genome sequencing techniques and the importance of applying relevant analysis methods.
Affiliation(s)
- Anikó Bozsik
- Department of Molecular Genetics, The National Tumor Biology Laboratory, National Institute of Oncology, Comprehensive Cancer Center, Ráth György út 7-9, Budapest H-1122, Hungary; Hereditary Tumours Research Group, Eötvös Loránd Research Network, Nagyvárad tér 4, Budapest H-1089, Hungary.
- Henriett Butz
- Department of Molecular Genetics, The National Tumor Biology Laboratory, National Institute of Oncology, Comprehensive Cancer Center, Ráth György út 7-9, Budapest H-1122, Hungary; Hereditary Tumours Research Group, Eötvös Loránd Research Network, Nagyvárad tér 4, Budapest H-1089, Hungary; Department of Laboratory Medicine, Semmelweis University, Ráth György út 7-9, Budapest H-1122, Hungary; Department of Oncology Biobank, National Institute of Oncology, Budapest 1122, Hungary
- Vince Kornél Grolmusz
- Department of Molecular Genetics, The National Tumor Biology Laboratory, National Institute of Oncology, Comprehensive Cancer Center, Ráth György út 7-9, Budapest H-1122, Hungary; Hereditary Tumours Research Group, Eötvös Loránd Research Network, Nagyvárad tér 4, Budapest H-1089, Hungary
- Tímea Pócza
- Department of Molecular Genetics, The National Tumor Biology Laboratory, National Institute of Oncology, Comprehensive Cancer Center, Ráth György út 7-9, Budapest H-1122, Hungary
- Attila Patócs
- Department of Molecular Genetics, The National Tumor Biology Laboratory, National Institute of Oncology, Comprehensive Cancer Center, Ráth György út 7-9, Budapest H-1122, Hungary; Hereditary Tumours Research Group, Eötvös Loránd Research Network, Nagyvárad tér 4, Budapest H-1089, Hungary; Department of Laboratory Medicine, Semmelweis University, Ráth György út 7-9, Budapest H-1122, Hungary
- János Papp
- Department of Molecular Genetics, The National Tumor Biology Laboratory, National Institute of Oncology, Comprehensive Cancer Center, Ráth György út 7-9, Budapest H-1122, Hungary; Hereditary Tumours Research Group, Eötvös Loránd Research Network, Nagyvárad tér 4, Budapest H-1089, Hungary
2
Jónsson BA, Halldórsson GH, Árdal S, Rögnvaldsson S, Einarsson E, Sulem P, Guðbjartsson DF, Melsted P, Stefánsson K, Úlfarsson MÖ. Transformers significantly improve splice site prediction. Commun Biol 2024; 7:1616. [PMID: 39633146 PMCID: PMC11618611 DOI: 10.1038/s42003-024-07298-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/27/2024] [Accepted: 11/20/2024] [Indexed: 12/07/2024] Open
Abstract
Mutations that affect RNA splicing significantly impact human diversity and disease. Here we present a method using transformers, a type of machine learning model, to detect splicing from raw 45,000-nucleotide sequences. We generate embeddings with residual neural networks and apply hard attention to select splice site candidates, enabling efficient training on long sequences. Our method surpasses the leading tool, SpliceAI, in detecting splice sites in GENCODE and ENSEMBL annotations. Using extensive RNA sequencing data from an Icelandic cohort of 17,848 individuals and the Genotype-Tissue Expression (GTEx) project, our method demonstrates superior performance in detecting splice junctions compared to SpliceAI-10k (PR-AUC = 0.834 vs. PR-AUC = 0.820) and is more effective at identifying disease-related splice variants in ClinVar (PR-AUC = 0.997 vs. PR-AUC = 0.996). These advancements hold promise for improving genetic research and clinical diagnostics, potentially leading to better understanding and treatment of splicing-related diseases.
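The abstract above compares methods by PR-AUC (area under the precision-recall curve). As a rough illustration of what that metric measures, the sketch below computes average precision, one common step-wise estimate of that area. It is not the authors' evaluation code; the function name and the toy scores and labels are made up for illustration, and tied scores are simply left in input order.

```python
def average_precision(scores, labels):
    """Step-wise estimate of the area under the precision-recall curve.

    scores: predicted splice-site scores (higher means more confident).
    labels: 1 for a true splice site, 0 otherwise.
    """
    # Rank candidates from most to least confident.
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    n_pos = sum(1 for _, label in ranked if label)
    if n_pos == 0:
        raise ValueError("at least one positive label is required")

    hits = 0
    total = 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label:
            hits += 1
            # Precision at the rank where this true site is recovered.
            total += hits / rank
    return total / n_pos


# Hypothetical toy data, not taken from the paper.
toy_scores = [0.9, 0.8, 0.7, 0.6]
toy_labels = [1, 0, 1, 0]
toy_ap = average_precision(toy_scores, toy_labels)
```

A value of 1.0 means every true site is ranked above every false one, which is why small differences near the top of the scale (such as 0.997 vs. 0.996) correspond to very few ranking changes.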
Affiliation(s)
- Benedikt A Jónsson
- deCODE Genetics/Amgen Inc., Reykjavik, Iceland
- University of Iceland, Reykjavik, Iceland
- Steinþór Árdal
- deCODE Genetics/Amgen Inc., Reykjavik, Iceland
- University of Iceland, Reykjavik, Iceland
- Daníel F Guðbjartsson
- deCODE Genetics/Amgen Inc., Reykjavik, Iceland
- University of Iceland, Reykjavik, Iceland
- Páll Melsted
- deCODE Genetics/Amgen Inc., Reykjavik, Iceland
- University of Iceland, Reykjavik, Iceland
- Kári Stefánsson
- deCODE Genetics/Amgen Inc., Reykjavik, Iceland
- University of Iceland, Reykjavik, Iceland
- Magnús Ö Úlfarsson
- deCODE Genetics/Amgen Inc., Reykjavik, Iceland
- University of Iceland, Reykjavik, Iceland
3
Barish S, Lin SJ, Maroofian R, Gezdirici A, Alhebby H, Trimouille A, Biderman Waberski M, Mitani T, Huber I, Tveten K, Holla ØL, Busk ØL, Houlden H, Ghayoor Karimiani E, Beiraghi Toosi M, Shervin Badv R, Najarzadeh Torbati P, Eghbal F, Akhondian J, Al Safar A, Alswaid A, Zifarelli G, Bauer P, Marafi D, Fatih JM, Huang K, Petree C, Calame DG, von der Lippe C, Alkuraya FS, Wali S, Lupski JR, Varshney GK, Posey JE, Pehlivan D. Homozygous variants in WDR83OS lead to a neurodevelopmental disorder with hypercholanemia. Am J Hum Genet 2024; 111:2566-2581. [PMID: 39471804 PMCID: PMC11568760 DOI: 10.1016/j.ajhg.2024.10.002] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/28/2023] [Revised: 10/01/2024] [Accepted: 10/03/2024] [Indexed: 11/01/2024] Open
Abstract
WD repeat domain 83 opposite strand (WDR83OS) encodes the 106-aa (amino acid) protein Asterix, which heterodimerizes with CCDC47 to form the PAT (protein associated with ER translocon) complex. This complex functions as a chaperone for large proteins containing transmembrane domains to ensure proper folding. Until recently, little was known about the role of WDR83OS or CCDC47 in human disease traits. However, biallelic variants in CCDC47 were identified in four unrelated families with trichohepatoneurodevelopmental syndrome, characterized by a neurodevelopmental disorder (NDD) with liver dysfunction. Three affected siblings in an additional family share a homozygous truncating WDR83OS variant and a phenotype of NDD, dysmorphic features, and liver dysfunction. Using family-based rare variant analyses of exome sequencing (ES) data and case matching through GeneMatcher, we describe the clinical phenotypes of 11 additional individuals in eight unrelated families (nine unrelated families, 14 individuals in total) with biallelic putative truncating variants in WDR83OS. Consistent clinical features include NDD (14/14), facial dysmorphism (13/14), intractable itching (9/14), and elevated bile acids (5/6). Whereas bile acids were significantly elevated in 5/6 of individuals tested, bilirubin was normal and liver enzymes were normal to mildly elevated in all 14 individuals. In three of six individuals for whom longitudinal data were available, we observed a progressive reduction in relative head circumference. A zebrafish model lacking Wdr83os function further supports its role in the nervous system, craniofacial development, and lipid absorption. Taken together, our data support a disease-gene association between biallelic loss-of-function of WDR83OS and a neurological disease trait with hypercholanemia.
Affiliation(s)
- Scott Barish
- Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX 77030, USA
- Sheng-Jia Lin
- Genes & Human Disease Research Program, Oklahoma Medical Research Foundation, Oklahoma City, OK 73104, USA
- Reza Maroofian
- Department of Neuromuscular Disorders, UCL Queen Square Institute of Neurology, London WC1N 3BG, UK
- Alper Gezdirici
- Department of Medical Genetics, Basaksehir Cam and Sakura City Hospital, Istanbul 34480, Turkey
- Hamoud Alhebby
- Division of Gastroenterology, Department of Pediatrics, Prince Sultan Military Medical City, Riyadh, Saudi Arabia
- Aurélien Trimouille
- Department of Medical Genetics, University Hospital of Bordeaux, 33076 Bordeaux, France; INSERM U1211, Laboratoire Maladies Rares: Génétique et Métabolisme, Bordeaux University, Bordeaux, France
- Tadahiro Mitani
- Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX 77030, USA
- Ilka Huber
- Department of Pediatrics, Sørlandet Hospital, Arendal, Norway
- Kristian Tveten
- Department of Medical Genetics, Telemark Hospital Trust, Skien, Norway
- Øystein L Holla
- Department of Medical Genetics, Telemark Hospital Trust, Skien, Norway
- Øyvind L Busk
- Department of Medical Genetics, Telemark Hospital Trust, Skien, Norway
- Henry Houlden
- Department of Neuromuscular Disorders, UCL Queen Square Institute of Neurology, London WC1N 3BG, UK
- Ehsan Ghayoor Karimiani
- Molecular and Clinical Sciences Institute, St. George's, University of London, Cranmer Terrace, London SW17 0RE, UK
- Mehran Beiraghi Toosi
- Department of Pediatric Diseases, Faculty of Medicine, Mashhad University of Medical Sciences, Mashhad, Iran; Neuroscience Research Center, Mashhad University of Medical Sciences, Mashhad, Iran
- Reza Shervin Badv
- Children's Medical Center, Pediatrics Center of Excellence, Tehran University of Medical Sciences, Tehran, Iran
- Fatemeh Eghbal
- Department of Medical Genetics, Next Generation Genetic Polyclinic, Mashhad, Iran
- Javad Akhondian
- Neuroscience Research Center, Mashhad University of Medical Sciences, Mashhad, Iran
- Ayat Al Safar
- College of Medicine, Imam Abdulrahman bin Faisal University, Dammam, Saudi Arabia; Department of Paediatrics, King Fahd Hospital of University, Al-khobar, Saudi Arabia
- Abdulrahman Alswaid
- King Saud Bin Abdulaziz University for Health Sciences, Department of Pediatrics, MC 1940, King Abdullah Specialized Children's Hospital, Riyadh, Saudi Arabia
- Peter Bauer
- CENTOGENE GmbH, Am Strande 7, 18055 Rostock, Germany
- Dana Marafi
- Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX 77030, USA; Department of Pediatrics, Faculty of Medicine, Kuwait University, P.O. Box 24923, Safat 13110, Kuwait
- Jawid M Fatih
- Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX 77030, USA
- Kevin Huang
- Genes & Human Disease Research Program, Oklahoma Medical Research Foundation, Oklahoma City, OK 73104, USA
- Cassidy Petree
- Genes & Human Disease Research Program, Oklahoma Medical Research Foundation, Oklahoma City, OK 73104, USA
- Daniel G Calame
- Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX 77030, USA; Section of Neurology and Developmental Neuroscience, Department of Pediatrics, Baylor College of Medicine, Houston, TX 77030, USA; Texas Children's Hospital, Houston, TX 77030, USA
- Fowzan S Alkuraya
- Department of Translational Genomics, Center for Genomic Medicine, King Faisal Specialist Hospital and Research Centre, Riyadh, Saudi Arabia
- Sami Wali
- Division of Gastroenterology, Department of Pediatrics, Prince Sultan Military Medical City, Riyadh, Saudi Arabia
- James R Lupski
- Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX 77030, USA; Texas Children's Hospital, Houston, TX 77030, USA; The Human Genome Sequencing Center, Baylor College of Medicine, Houston, TX, USA; Department of Pediatrics, Baylor College of Medicine, Houston, TX, USA
- Gaurav K Varshney
- Genes & Human Disease Research Program, Oklahoma Medical Research Foundation, Oklahoma City, OK 73104, USA
- Jennifer E Posey
- Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX 77030, USA.
- Davut Pehlivan
- Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX 77030, USA; Section of Neurology and Developmental Neuroscience, Department of Pediatrics, Baylor College of Medicine, Houston, TX 77030, USA; Texas Children's Hospital, Houston, TX 77030, USA.
4
Keefer-Jacques E, Valente N, Jacko AM, Matwijec G, Reese A, Tekriwal A, Loomes KM, Spinner NB, Gilbert MA. Investigation of cryptic JAG1 splice variants as a cause of Alagille syndrome and performance evaluation of splice predictor tools. HGG Adv 2024; 5:100351. [PMID: 39244638 PMCID: PMC11440345 DOI: 10.1016/j.xhgg.2024.100351] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/02/2024] [Revised: 09/03/2024] [Accepted: 09/04/2024] [Indexed: 09/09/2024] Open
Abstract
Haploinsufficiency of JAG1 is the primary cause of Alagille syndrome (ALGS), a rare, multisystem disorder. The identification of JAG1 intronic variants outside of the canonical splice region as well as missense variants, both of which lead to uncertain associations with disease, confuses diagnostics. Strategies to determine whether these variants affect splicing include the study of patient RNA or minigene constructs, which are not always available or can be laborious to design, as well as the utilization of computational splice prediction tools. These tools, including SpliceAI and Pangolin, use algorithms to calculate the probability that a variant results in a splice alteration, expressed as a Δ score, with higher Δ scores (>0.2 on a 0-1 scale) positively correlated with aberrant splicing. We studied the consequence of 10 putative splice variants in ALGS patient samples through RNA analysis and compared this to SpliceAI and Pangolin predictions. We identified eight variants with aberrant splicing, seven of which had not been previously validated. Combining these data with non-canonical and missense splice variants reported in the literature, we identified a predictive threshold for SpliceAI and Pangolin with high sensitivity (Δ score >0.6). Moreover, we showed reduced specificity for variants with low Δ scores (<0.2), highlighting a limitation of these tools that results in the misidentification of true splice variants. These results improve genomic diagnostics for ALGS by confirming splice effects for seven variants and suggest that the integration of splice prediction tools with RNA analysis is important to ensure accurate clinical variant classifications.
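The Δ score cut-offs mentioned in this abstract (0.2 and 0.6 on a 0-1 scale) can be read as three bands. The sketch below is only an illustration of that reading: the band names, the function name, and the handling of values exactly on a boundary are assumptions, not something the abstract specifies, and it does not call SpliceAI or Pangolin.

```python
# Cut-offs quoted in the abstract; everything else here is hypothetical.
LOW_DELTA = 0.2
HIGH_DELTA = 0.6


def delta_band(delta):
    """Place a splice-predictor delta score (0-1 scale) into a band.

    Values exactly at 0.2 fall into "low" and values exactly at 0.6 fall
    into "intermediate"; that boundary handling is an assumption.
    """
    if not 0.0 <= delta <= 1.0:
        raise ValueError("delta score must lie between 0 and 1")
    if delta > HIGH_DELTA:
        return "high"
    if delta > LOW_DELTA:
        return "intermediate"
    return "low"


# Hypothetical scores, not variants from the study.
example_bands = [delta_band(d) for d in (0.75, 0.4, 0.1)]
```

Because the abstract reports that true splice variants can be missed at low Δ scores, a "low" band in a sketch like this should not be taken as evidence of normal splicing; the authors recommend pairing predictions with RNA analysis.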
Affiliation(s)
- Ernest Keefer-Jacques
- Division of Genomic Diagnostics, Department of Pathology and Laboratory Medicine, The Children's Hospital of Philadelphia, Philadelphia, PA 19104, USA
- Nicolette Valente
- Division of Genomic Diagnostics, Department of Pathology and Laboratory Medicine, The Children's Hospital of Philadelphia, Philadelphia, PA 19104, USA
- Anastasia M Jacko
- Division of Genomic Diagnostics, Department of Pathology and Laboratory Medicine, The Children's Hospital of Philadelphia, Philadelphia, PA 19104, USA
- Grace Matwijec
- Division of Genomic Diagnostics, Department of Pathology and Laboratory Medicine, The Children's Hospital of Philadelphia, Philadelphia, PA 19104, USA
- Apsara Reese
- Division of Genomic Diagnostics, Department of Pathology and Laboratory Medicine, The Children's Hospital of Philadelphia, Philadelphia, PA 19104, USA
- Aarna Tekriwal
- Division of Genomic Diagnostics, Department of Pathology and Laboratory Medicine, The Children's Hospital of Philadelphia, Philadelphia, PA 19104, USA
- Kathleen M Loomes
- Division of Pediatric Gastroenterology, Hepatology, and Nutrition, The Children's Hospital of Philadelphia, Philadelphia, PA 19104, USA; Department of Pediatrics, The Perelman School of Medicine at the University of Pennsylvania, Philadelphia, PA 19104, USA
- Nancy B Spinner
- Division of Genomic Diagnostics, Department of Pathology and Laboratory Medicine, The Children's Hospital of Philadelphia, Philadelphia, PA 19104, USA; Department of Pathology and Laboratory Medicine, The Perelman School of Medicine at the University of Pennsylvania, Philadelphia, PA 19104, USA
- Melissa A Gilbert
- Division of Genomic Diagnostics, Department of Pathology and Laboratory Medicine, The Children's Hospital of Philadelphia, Philadelphia, PA 19104, USA; Division of Pediatric Gastroenterology, Hepatology, and Nutrition, The Children's Hospital of Philadelphia, Philadelphia, PA 19104, USA; Department of Pathology and Laboratory Medicine, The Perelman School of Medicine at the University of Pennsylvania, Philadelphia, PA 19104, USA.
5
Johansson PA, Palmer JM, McGrath L, Warrier S, Hamilton HR, Beckman T, D'Mellow MG, Brooks KM, Glasson W, Hayward NK, Pritchard AL. Germline Variants in Patients Affected by Both Uveal and Cutaneous Melanoma. Pigment Cell Melanoma Res 2024. [PMID: 39315505 DOI: 10.1111/pcmr.13199] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/12/2024] [Revised: 08/15/2024] [Accepted: 09/03/2024] [Indexed: 09/25/2024]
Abstract
Uveal melanoma (UM) and nonacral cutaneous melanoma (CM) are distinct entities with varied genetic landscapes despite both arising from melanocytes. There are, however, similarities in that they most frequently affect people of European ancestry, and high penetrance germline variants in BAP1, POT1 and CDKN2A have been shown to predispose to both UM and CM. This study aims to further explore germline variants in patients affected by both UM and CM, shedding light on the underlying genetic mechanism causing these diseases. Using exome sequencing we analysed germline DNA samples from a cohort of 83 Australian patients diagnosed with both UM and CM. Eight (10%) patients were identified that carried pathogenic mutations in known melanoma predisposition genes POT1, MITF, OCA2, SLC45A2 and TYR. Three (4%) patients carried pathogenic variants in genes previously linked with other cancer syndromes (ATR, BRIP1 and MSH6) and another three cases carried monoallelic pathogenic variants in recessive cancer genes (xeroderma pigmentosum and Fanconi anaemia), indicating that reduced penetrance of phenotype in these individuals may contribute to the development of both UM and CM. These findings highlight the need for further studies characterising the role of these genes in melanoma susceptibility.
Affiliation(s)
- Peter A Johansson
- QIMR Berghofer Medical Research Institute, Brisbane, Queensland, Australia
- University of Queensland, Brisbane, Queensland, Australia
- Jane M Palmer
- QIMR Berghofer Medical Research Institute, Brisbane, Queensland, Australia
- Lindsay McGrath
- Queensland Ocular Oncology Service, The Terrace Eye Centre, Brisbane, Queensland, Australia
- Sunil Warrier
- Queensland Ocular Oncology Service, The Terrace Eye Centre, Brisbane, Queensland, Australia
- Hayley R Hamilton
- QIMR Berghofer Medical Research Institute, Brisbane, Queensland, Australia
- Timothy Beckman
- Queensland Ocular Oncology Service, The Terrace Eye Centre, Brisbane, Queensland, Australia
- Matthew G D'Mellow
- QIMR Berghofer Medical Research Institute, Brisbane, Queensland, Australia
- Kelly M Brooks
- QIMR Berghofer Medical Research Institute, Brisbane, Queensland, Australia
- University of Queensland, Brisbane, Queensland, Australia
- William Glasson
- Queensland Ocular Oncology Service, The Terrace Eye Centre, Brisbane, Queensland, Australia
- Nicholas K Hayward
- QIMR Berghofer Medical Research Institute, Brisbane, Queensland, Australia
- Antonia L Pritchard
- QIMR Berghofer Medical Research Institute, Brisbane, Queensland, Australia
- Department of Genetics and Immunology, Division of Biomedical Science, University of the Highlands and Islands, Inverness, Scotland, UK
6
Gilbert MA, Keefer-Jacques E, Jadhav T, Antfolk D, Ming Q, Valente N, Shaw GTW, Sottolano CJ, Matwijec G, Luca VC, Loomes KM, Rajagopalan R, Hayeck TJ, Spinner NB. Functional characterization of 2,832 JAG1 variants supports reclassification for Alagille syndrome and improves guidance for clinical variant interpretation. Am J Hum Genet 2024; 111:1656-1672. [PMID: 39043182 PMCID: PMC11339624 DOI: 10.1016/j.ajhg.2024.06.011] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/10/2024] [Revised: 06/15/2024] [Accepted: 06/24/2024] [Indexed: 07/25/2024] Open
Abstract
Pathogenic variants in the JAG1 gene are a primary cause of the multi-system disorder Alagille syndrome. Although variant detection rates are high for this disease, there is uncertainty associated with the classification of missense variants that leads to reduced diagnostic yield. Consequently, up to 85% of reported JAG1 missense variants have uncertain or conflicting classifications. We generated a library of 2,832 JAG1 nucleotide variants within exons 1-7, a region with a high number of reported missense variants, and designed a high-throughput assay to measure JAG1 membrane expression, a requirement for normal function. After calibration using a set of 175 known or predicted pathogenic and benign variants included within the variant library, 486 variants were characterized as functionally abnormal (n = 277 abnormal and n = 209 likely abnormal), of which 439 (90.3%) were missense. We identified divergent membrane expression occurring at specific residues, indicating that loss of the wild-type residue itself does not drive pathogenicity, a finding supported by structural modeling data and with broad implications for clinical variant classification both for Alagille syndrome and globally across other disease genes. Of 144 uncertain variants reported in patients undergoing clinical or research testing, 27 had functionally abnormal membrane expression, and inclusion of our data resulted in the reclassification of 26 to likely pathogenic. Functional evidence augments the classification of genomic variants, reducing uncertainty and improving diagnostics. Inclusion of this repository of functional evidence during JAG1 variant reclassification will significantly affect resolution of variant pathogenicity, making a critical impact on the molecular diagnosis of Alagille syndrome.
Affiliation(s)
- Melissa A Gilbert
- Division of Genomic Diagnostics, Department of Pathology and Laboratory Medicine, The Children's Hospital of Philadelphia, Philadelphia, PA 19104, USA; Department of Pathology and Laboratory Medicine, The Perelman School of Medicine at The University of Pennsylvania, Philadelphia, PA 19104, USA; Division of Pediatric Gastroenterology, Hepatology, and Nutrition, The Children's Hospital of Philadelphia, Philadelphia, PA 19104, USA.
- Ernest Keefer-Jacques
- Division of Genomic Diagnostics, Department of Pathology and Laboratory Medicine, The Children's Hospital of Philadelphia, Philadelphia, PA 19104, USA
- Tanaya Jadhav
- Division of Genomic Diagnostics, Department of Pathology and Laboratory Medicine, The Children's Hospital of Philadelphia, Philadelphia, PA 19104, USA
- Daniel Antfolk
- Department of Immunology, H. Lee Moffitt Cancer Center & Research Institute, Tampa, FL 33612, USA
- Qianqian Ming
- Department of Immunology, H. Lee Moffitt Cancer Center & Research Institute, Tampa, FL 33612, USA
- Nicolette Valente
- Division of Genomic Diagnostics, Department of Pathology and Laboratory Medicine, The Children's Hospital of Philadelphia, Philadelphia, PA 19104, USA
- Grace Tzun-Wen Shaw
- Division of Genomic Diagnostics, Department of Pathology and Laboratory Medicine, The Children's Hospital of Philadelphia, Philadelphia, PA 19104, USA
- Christopher J Sottolano
- Division of Genomic Diagnostics, Department of Pathology and Laboratory Medicine, The Children's Hospital of Philadelphia, Philadelphia, PA 19104, USA; Department of Pathology and Laboratory Medicine, The Perelman School of Medicine at The University of Pennsylvania, Philadelphia, PA 19104, USA
- Grace Matwijec
- Division of Genomic Diagnostics, Department of Pathology and Laboratory Medicine, The Children's Hospital of Philadelphia, Philadelphia, PA 19104, USA
- Vincent C Luca
- Department of Immunology, H. Lee Moffitt Cancer Center & Research Institute, Tampa, FL 33612, USA
- Kathleen M Loomes
- Division of Pediatric Gastroenterology, Hepatology, and Nutrition, The Children's Hospital of Philadelphia, Philadelphia, PA 19104, USA; Department of Pediatrics, The Perelman School of Medicine at the University of Pennsylvania, Philadelphia, PA 19104, USA
- Ramakrishnan Rajagopalan
- Division of Genomic Diagnostics, Department of Pathology and Laboratory Medicine, The Children's Hospital of Philadelphia, Philadelphia, PA 19104, USA; Department of Pathology and Laboratory Medicine, The Perelman School of Medicine at The University of Pennsylvania, Philadelphia, PA 19104, USA
- Tristan J Hayeck
- Division of Genomic Diagnostics, Department of Pathology and Laboratory Medicine, The Children's Hospital of Philadelphia, Philadelphia, PA 19104, USA; Department of Pathology and Laboratory Medicine, The Perelman School of Medicine at The University of Pennsylvania, Philadelphia, PA 19104, USA
- Nancy B Spinner
- Division of Genomic Diagnostics, Department of Pathology and Laboratory Medicine, The Children's Hospital of Philadelphia, Philadelphia, PA 19104, USA; Department of Pathology and Laboratory Medicine, The Perelman School of Medicine at The University of Pennsylvania, Philadelphia, PA 19104, USA
7
McCue K, Burge CB. An interpretable model of pre-mRNA splicing for animal and plant genes. Sci Adv 2024; 10:eadn1547. [PMID: 38718117 PMCID: PMC11078188 DOI: 10.1126/sciadv.adn1547] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/25/2023] [Accepted: 04/04/2024] [Indexed: 05/12/2024]
Abstract
Pre-mRNA splicing is a fundamental step in gene expression, conserved across eukaryotes, in which the spliceosome recognizes motifs at the 3' and 5' splice sites (SSs), excises introns, and ligates exons. SS recognition and pairing is often influenced by protein splicing factors (SFs) that bind to splicing regulatory elements (SREs). Here, we describe SMsplice, a fully interpretable model of pre-mRNA splicing that combines models of core SS motifs, SREs, and exonic and intronic length preferences. We learn models that predict SS locations with 83 to 86% accuracy in fish, insects, and plants and about 70% in mammals. Learned SRE motifs include both known SF binding motifs and unfamiliar motifs, and both motif classes are supported by genetic analyses. Our comparisons across species highlight similarities between non-mammals, increased reliance on intronic SREs in plant splicing, and a greater reliance on SREs in mammalian splicing.
Affiliation(s)
- Kayla McCue
- Computational and Systems Biology PhD Program, Massachusetts Institute of Technology, Cambridge, MA 02139, USA
- Department of Biology, Massachusetts Institute of Technology, Cambridge, MA 02139
- Christopher B. Burge
- Computational and Systems Biology PhD Program, Massachusetts Institute of Technology, Cambridge, MA 02139, USA
- Department of Biology, Massachusetts Institute of Technology, Cambridge, MA 02139
8
Lord J, Oquendo CJ, Wai HA, Douglas AGL, Bunyan DJ, Wang Y, Hu Z, Zeng Z, Danis D, Katsonis P, Williams A, Lichtarge O, Chang Y, Bagnall RD, Mount SM, Matthiasardottir B, Lin C, Hansen TVO, Leman R, Martins A, Houdayer C, Krieger S, Bakolitsa C, Peng Y, Kamandula A, Radivojac P, Baralle D. Predicting the impact of rare variants on RNA splicing in CAGI6. Hum Genet 2024:10.1007/s00439-023-02624-3. [PMID: 38170232 DOI: 10.1007/s00439-023-02624-3] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 06/15/2023] [Accepted: 11/18/2023] [Indexed: 01/05/2024]
Abstract
Variants which disrupt splicing are a frequent cause of rare disease that have been under-ascertained clinically. Accurate and efficient methods to predict a variant's impact on splicing are needed to interpret the growing number of variants of unknown significance (VUS) identified by exome and genome sequencing. Here, we present the results of the CAGI6 Splicing VUS challenge, which invited predictions of the splicing impact of 56 variants ascertained clinically and functionally validated to determine splicing impact. The performance of 12 prediction methods, along with SpliceAI and CADD, was compared on the 56 functionally validated variants. The maximum accuracy achieved was 82% from two different approaches, one weighting SpliceAI scores by minor allele frequency, and one applying the recently published Splicing Prediction Pipeline (SPiP). SPiP performed optimally in terms of sensitivity, while an ensemble method combining multiple prediction tools and information from databases exceeded all others for specificity. Several challenge methods equalled or exceeded the performance of SpliceAI, with ultimate choice of prediction method likely to depend on experimental or clinical aims. One quarter of the variants were incorrectly predicted by at least 50% of the methods, highlighting the need for further improvements to splicing prediction methods for successful clinical application.
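The abstract ranks methods by accuracy, sensitivity, and specificity. For readers less familiar with how these three differ, the sketch below computes them from paired truth and prediction labels using the standard definitions. The function name and the five-variant toy data are hypothetical; this is not the challenge's scoring code and the numbers are not from CAGI6.

```python
def splice_metrics(truth, predicted):
    """Accuracy, sensitivity, and specificity for binary splice calls.

    truth / predicted: equal-length sequences of booleans, where True
    means "variant disrupts splicing".
    """
    if len(truth) != len(predicted):
        raise ValueError("truth and predicted must have the same length")

    tp = sum(1 for t, p in zip(truth, predicted) if t and p)
    tn = sum(1 for t, p in zip(truth, predicted) if not t and not p)
    fp = sum(1 for t, p in zip(truth, predicted) if not t and p)
    fn = sum(1 for t, p in zip(truth, predicted) if t and not p)

    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative variant")

    return {
        "accuracy": (tp + tn) / len(truth),
        "sensitivity": tp / (tp + fn),  # share of true splice variants found
        "specificity": tn / (tn + fp),  # share of neutral variants cleared
    }


# Hypothetical toy labels for five variants.
toy_truth = [True, True, True, False, False]
toy_predicted = [True, True, False, False, True]
toy_metrics = splice_metrics(toy_truth, toy_predicted)
```

Seen this way, a method can lead on sensitivity while another leads on specificity, which is consistent with the abstract's remark that the choice of tool depends on experimental or clinical aims.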
Affiliation(s)
- Jenny Lord
- Human Development and Health, Faculty of Medicine, University of Southampton, Southampton, UK
- Htoo A Wai
- Human Development and Health, Faculty of Medicine, University of Southampton, Southampton, UK
- Andrew G L Douglas
- Human Development and Health, Faculty of Medicine, University of Southampton, Southampton, UK
- Oxford Centre for Genomic Medicine, Oxford University Hospitals NHS Foundation Trust, Oxford, UK
- David J Bunyan
- Human Development and Health, Faculty of Medicine, University of Southampton, Southampton, UK
- Wessex Regional Genetics Laboratory, Salisbury District Hospital, Salisbury, UK
- Yaqiong Wang
- Center for Molecular Medicine, Children's Hospital of Fudan University, National Children's Medical Center, Shanghai, 201102, China
| | - Zhiqiang Hu
- University of California, Berkeley, Berkeley, CA, 94720, USA
| | - Zishuo Zeng
- Department of Biochemistry and Microbiology, Rutgers University, New Brunswick, NJ, 08873, USA
| | - Daniel Danis
- The Jackson Laboratory for Genomic Medicine, 10 Discovery Drive, Farmington, CT, 06032, USA
| | - Panagiotis Katsonis
- Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX, 77030, USA
| | - Amanda Williams
- Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX, 77030, USA
| | - Olivier Lichtarge
- Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX, 77030, USA
| | - Yuchen Chang
- Agnes Ginges Centre for Molecular Cardiology at Centenary Institute, University of Sydney, Sydney, Australia
- Faculty of Medicine and Health, University of Sydney, Sydney, Australia
| | - Richard D Bagnall
- Agnes Ginges Centre for Molecular Cardiology at Centenary Institute, University of Sydney, Sydney, Australia
- Faculty of Medicine and Health, University of Sydney, Sydney, Australia
| | - Stephen M Mount
- Department of Cell Biology and Molecular Genetics, University of Maryland, College Park, MD, USA
| | - Brynja Matthiasardottir
- Graduate Program in Biological Sciences and Department of Cell Biology and Molecular Genetics, University of Maryland, College Park, MD, USA
- Inflammatory Disease Section, National Human Genome Research Institute, Bethesda, MD, USA
| | | | - Thomas van Overeem Hansen
- Department of Clinical Genetics, University Hospital of Copenhagen, Rigshospitalet, Copenhagen, Denmark
- Department of Clinical Medicine, Faculty of Health and Medical Sciences, University of Copenhagen, Copenhagen, Denmark
| | - Raphael Leman
- Laboratoire de Biologie et Génétique du Cancer, Centre François Baclesse, Caen, France
- Inserm U1245, Cancer Brain and Genomics, Normandie Université, UNICAEN, FHU G4 génomique, Rouen, France
| | - Alexandra Martins
- Inserm U1245, Cancer Brain and Genomics, Normandie Université, UNIROUEN, FHU G4 génomique, Rouen, France
| | - Claude Houdayer
- Inserm U1245, Cancer Brain and Genomics, Normandie Université, UNIROUEN, FHU G4 génomique, Rouen, France
- Department of Genetics, Univ Rouen Normandie, INSERM U1245, FHU-G4 Génomique and CHU Rouen, 76000, Rouen, France
| | - Sophie Krieger
- Laboratoire de Biologie et Génétique du Cancer, Centre François Baclesse, Caen, France
- Inserm U1245, Cancer Brain and Genomics, Normandie Université, UNICAEN, FHU G4 génomique, Rouen, France
| | | | - Yisu Peng
- Khoury College of Computer Sciences, Northeastern University, Boston, MA, 02115, USA
| | - Akash Kamandula
- Khoury College of Computer Sciences, Northeastern University, Boston, MA, 02115, USA
| | - Predrag Radivojac
- Khoury College of Computer Sciences, Northeastern University, Boston, MA, 02115, USA
| | - Diana Baralle
- Human Development and Health, Faculty of Medicine, University of Southampton, Southampton, UK.
- Wessex Clinical Genetics Service, University Hospital Southampton NHS Foundation Trust, Southampton, UK.
9
Smith C, Kitzman JO. Benchmarking splice variant prediction algorithms using massively parallel splicing assays. Genome Biol 2023; 24:294. [PMID: 38129864 PMCID: PMC10734170 DOI: 10.1186/s13059-023-03144-z] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Received: 05/04/2023] [Accepted: 12/13/2023] [Indexed: 12/23/2023] Open
Abstract
BACKGROUND Variants that disrupt mRNA splicing account for a sizable fraction of the pathogenic burden in many genetic disorders, but identifying splice-disruptive variants (SDVs) beyond the essential splice site dinucleotides remains difficult. Computational predictors are often discordant, compounding the challenge of variant interpretation. Because they are primarily validated using clinical variant sets heavily biased to known canonical splice site mutations, it remains unclear how well their performance generalizes. RESULTS We benchmark eight widely used splicing effect prediction algorithms, leveraging massively parallel splicing assays (MPSAs) as a source of experimentally determined ground truth. MPSAs simultaneously assay many variants to nominate candidate SDVs. We compare experimentally measured splicing outcomes with bioinformatic predictions for 3,616 variants in five genes. Algorithms' concordance with MPSA measurements, and with each other, is lower for exonic than intronic variants, underscoring the difficulty of identifying missense or synonymous SDVs. Deep learning-based predictors trained on gene model annotations achieve the best overall performance at distinguishing disruptive and neutral variants, and, controlling for overall call rate genome-wide, SpliceAI and Pangolin have superior sensitivity. Finally, our results highlight two practical considerations when scoring variants genome-wide: finding an optimal score cutoff, and the substantial variability introduced by differences in gene model annotation. We suggest strategies for optimal splice effect prediction in the face of these issues. CONCLUSION SpliceAI and Pangolin show the best overall performance among predictors tested; however, improvements in splice effect prediction are still needed, especially within exons.
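One of the practical considerations named in this abstract is finding a score cutoff. The abstract does not say how the authors chose theirs, so the sketch below swaps in a common generic approach, maximising Youden's J (sensitivity + specificity - 1) over candidate cutoffs. The function name `best_cutoff` and the seven toy scores and labels are hypothetical and do not come from the paper.

```python
def best_cutoff(scores, labels):
    """Return (cutoff, J) maximising Youden's J; a variant is called
    disruptive when its score is >= cutoff."""
    positives = sum(1 for y in labels if y)
    negatives = len(labels) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("need at least one disruptive and one neutral variant")

    best = (None, -1.0)
    for cutoff in sorted(set(scores)):
        tp = sum(1 for s, y in zip(scores, labels) if y and s >= cutoff)
        tn = sum(1 for s, y in zip(scores, labels) if not y and s < cutoff)
        j = tp / positives + tn / negatives - 1
        if j > best[1]:  # strict comparison keeps the lowest cutoff on ties
            best = (cutoff, j)
    return best


# Toy predictor scores (hypothetical), with True = disruptive in the assay.
toy_scores = [0.05, 0.08, 0.10, 0.25, 0.30, 0.60, 0.90]
toy_labels = [False, True, False, False, True, True, True]
cutoff, youden_j = best_cutoff(toy_scores, toy_labels)
print(cutoff, youden_j)
```

A cutoff chosen this way weights sensitivity and specificity equally; a clinical pipeline that prioritises one over the other would need a different criterion.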
Affiliation(s)
- Cathy Smith
- Department of Human Genetics, University of Michigan Medical School, Ann Arbor, MI, 48109, USA
- Department of Computational Medicine and Bioinformatics, University of Michigan Medical School, Ann Arbor, MI, 48109, USA
| | - Jacob O Kitzman
- Department of Human Genetics, University of Michigan Medical School, Ann Arbor, MI, 48109, USA.
- Department of Computational Medicine and Bioinformatics, University of Michigan Medical School, Ann Arbor, MI, 48109, USA.
10
Moore AR, Yu J, Pei Y, Cheng EWY, Taylor Tavares AL, Walker WT, Thomas NS, Kamath A, Ibitoye R, Josifova D, Wilsdon A, Ross A, Calder AD, Offiah AC, Wilkie AOM, Taylor JC, Pagnamenta AT. Use of genome sequencing to hunt for cryptic second-hit variants: analysis of 31 cases recruited to the 100 000 Genomes Project. J Med Genet 2023; 60:1235-1244. [PMID: 37558402 PMCID: PMC10715503 DOI: 10.1136/jmg-2023-109362] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/26/2023] [Accepted: 07/28/2023] [Indexed: 08/11/2023]
Abstract
BACKGROUND Current clinical testing methods used to uncover the genetic basis of rare disease have inherent limitations, which can lead to causative pathogenic variants being missed. Within the rare disease arm of the 100 000 Genomes Project (100kGP), families were recruited under the clinical indication 'single autosomal recessive mutation in rare disease'. These participants presented with strong clinical suspicion for a specific autosomal recessive disorder, but only one suspected pathogenic variant had been identified through standard-of-care testing. Whole genome sequencing (WGS) aimed to identify cryptic 'second-hit' variants. METHODS To investigate the 31 families with available data that remained unsolved following formal review within the 100kGP, SVRare was used to aggregate structural variants present in <1% of 100kGP participants. Small variants were assessed using population allele frequency data and SpliceAI. Literature searches and publicly available online tools were used for further annotation of pathogenicity. RESULTS Using these strategies, 8/31 cases were solved, increasing the overall diagnostic yield of this cohort from 10/41 (24.4%) to 18/41 (43.9%). Exemplar cases include a patient with cystic fibrosis harbouring a novel exonic LINE1 insertion in CFTR and a patient with generalised arterial calcification of infancy with complex interlinked duplications involving exons 2-6 of ENPP1. Although ambiguous by short-read WGS, the ENPP1 variant structure was resolved using optical genome mapping and RNA analysis. CONCLUSION Systematic examination of cryptic variants across a multi-disease cohort successfully identifies additional pathogenic variants. WGS data analysis in autosomal recessive rare disease should consider complex structural and small intronic variants as potentially pathogenic second hits.
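The yield figures in this abstract follow from simple proportions: 10 of 41 families were solved before reanalysis, leaving the 31 unsolved families that were investigated, of which 8 were then solved. A minimal sketch of the arithmetic is given below; the helper name `diagnostic_yield` is hypothetical.

```python
def diagnostic_yield(solved, total):
    """Percentage of cases solved, rounded to one decimal place."""
    if total <= 0:
        raise ValueError("total must be positive")
    return round(100 * solved / total, 1)


cohort = 41            # families in the cohort
solved_before = 10     # solved before the reanalysis
newly_solved = 8       # solved among the remaining families

unsolved_before = cohort - solved_before
baseline = diagnostic_yield(solved_before, cohort)
updated = diagnostic_yield(solved_before + newly_solved, cohort)
print(unsolved_before, baseline, updated)
```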
Affiliation(s)
- A Rachel Moore
- Wellcome Centre for Human Genetics, NIHR Oxford Biomedical Research Centre, University of Oxford, Oxford, UK
- Cambridge Genomics Laboratory, Cambridge University Hospitals NHS Foundation Trust, Cambridge, UK
| | - Jing Yu
- Wellcome Centre for Human Genetics, NIHR Oxford Biomedical Research Centre, University of Oxford, Oxford, UK
- Nuffield Department of Clinical Neurosciences, University of Oxford, Oxford, UK
| | - Yang Pei
- Clinical Genetics Group, MRC Weatherall Institute of Molecular Medicine, University of Oxford, Oxford, UK
| | | | | | - Woolf T Walker
- School of Clinical and Experimental Sciences, Faculty of Medicine, University of Southampton, Southampton, UK
- PCD Centre, University Hospital Southampton NHS Foundation Trust, Southampton, UK
| | - N Simon Thomas
- Wessex Regional Genetics Laboratory, Salisbury NHS Foundation Trust, Salisbury, UK
| | - Arveen Kamath
- All Wales Medical Genomics Service, University Hospital of Wales, Cardiff, UK
| | - Rita Ibitoye
- North West Thames Regional Genetics Service, Northwick Park Hospital, Harrow, London, UK
| | - Dragana Josifova
- Department of Clinical Genetics, Guy's and St Thomas' Hospitals NHS Trust, London, UK
| | - Anna Wilsdon
- Clinical Genetics, Nottingham City Hospital, Nottingham, UK
| | - Alison Ross
- Clinical Genetics, NHS Grampian, Aberdeen, UK
| | - Alistair D Calder
- Radiology Department, Great Ormond Street Hospital for Children NHS Foundation Trust, London, UK
| | - Amaka C Offiah
- Department of Oncology and Metabolism, The University of Sheffield, Sheffield, UK
| | - Andrew O M Wilkie
- Clinical Genetics Group, MRC Weatherall Institute of Molecular Medicine, University of Oxford, Oxford, UK
| | - Jenny C Taylor
- Wellcome Centre for Human Genetics, NIHR Oxford Biomedical Research Centre, University of Oxford, Oxford, UK
| | - Alistair T Pagnamenta
- Wellcome Centre for Human Genetics, NIHR Oxford Biomedical Research Centre, University of Oxford, Oxford, UK
11
Pagnamenta AT, Camps C, Giacopuzzi E, Taylor JM, Hashim M, Calpena E, Kaisaki PJ, Hashimoto A, Yu J, Sanders E, Schwessinger R, Hughes JR, Lunter G, Dreau H, Ferla M, Lange L, Kesim Y, Ragoussis V, Vavoulis DV, Allroggen H, Ansorge O, Babbs C, Banka S, Baños-Piñero B, Beeson D, Ben-Ami T, Bennett DL, Bento C, Blair E, Brasch-Andersen C, Bull KR, Cario H, Cilliers D, Conti V, Davies EG, Dhalla F, Dacal BD, Dong Y, Dunford JE, Guerrini R, Harris AL, Hartley J, Hollander G, Javaid K, Kane M, Kelly D, Kelly D, Knight SJL, Kreins AY, Kvikstad EM, Langman CB, Lester T, Lines KE, Lord SR, Lu X, Mansour S, Manzur A, Maroofian R, Marsden B, Mason J, McGowan SJ, Mei D, Mlcochova H, Murakami Y, Németh AH, Okoli S, Ormondroyd E, Ousager LB, Palace J, Patel SY, Pentony MM, Pugh C, Rad A, Ramesh A, Riva SG, Roberts I, Roy N, Salminen O, Schilling KD, Scott C, Sen A, Smith C, Stevenson M, Thakker RV, Twigg SRF, Uhlig HH, van Wijk R, Vona B, Wall S, Wang J, Watkins H, Zak J, Schuh AH, Kini U, Wilkie AOM, Popitsch N, Taylor JC. Structural and non-coding variants increase the diagnostic yield of clinical whole genome sequencing for rare diseases. Genome Med 2023; 15:94. [PMID: 37946251 PMCID: PMC10636885 DOI: 10.1186/s13073-023-01240-0] [Citation(s) in RCA: 14] [Impact Index Per Article: 14.0] [Received: 12/14/2022] [Accepted: 09/27/2023] [Indexed: 11/12/2023] Open
Abstract
BACKGROUND Whole genome sequencing (WGS) is increasingly being used for the diagnosis of patients with rare diseases. However, the diagnostic yields of many studies, particularly those conducted in a healthcare setting, are often disappointingly low, at 25-30%. This is in part because although entire genomes are sequenced, analysis is often confined to in silico gene panels or coding regions of the genome. METHODS We undertook WGS on a cohort of 122 unrelated rare disease patients and their relatives (300 genomes) who had been pre-screened by gene panels or arrays. Patients were recruited from a broad spectrum of clinical specialties. We applied a bioinformatics pipeline that would allow comprehensive analysis of all variant types. We combined established bioinformatics tools for phenotypic and genomic analysis with our novel algorithms (SVRare, ALTSPLICE and GREEN-DB) to detect and annotate structural, splice site and non-coding variants. RESULTS Our diagnostic yield was 43/122 cases (35%), although 47/122 cases (39%) were considered solved when taking novel candidate genes with supporting functional data into account. Structural, splice site and deep intronic variants contributed to 20/47 (43%) of our solved cases. Five genes that are novel, or were novel at the time of discovery, were identified, whilst a further three genes are putative novel disease genes with evidence of causality. We identified variants of uncertain significance in a further fourteen candidate genes. The phenotypic spectrum associated with RMND1 was expanded to include polymicrogyria. Two patients with secondary findings in FBN1 and KCNQ1 were confirmed to have previously unidentified Marfan and long QT syndromes, respectively, and were referred for further clinical interventions. Clinical diagnoses were changed in six patients and treatment adjustments made for eight individuals, which for five patients was considered life-saving.
CONCLUSIONS Genome sequencing is increasingly being considered as a first-line genetic test in routine clinical settings and can make a substantial contribution to rapidly identifying a causal aetiology for many patients, shortening their diagnostic odyssey. We have demonstrated that structural, splice site and intronic variants make a significant contribution to diagnostic yield and that comprehensive analysis of the entire genome is essential to maximise the value of clinical genome sequencing.
Affiliation(s)
- Alistair T Pagnamenta
- Wellcome Centre for Human Genetics, University of Oxford, Old Road Campus, Roosevelt Drive, Oxford, OX3 7BN, UK
- NIHR Oxford Biomedical Research Centre, John Radcliffe Hospital, Oxford University Hospitals NHS Foundation Trust, Oxford, OX3 9DU, UK
| | - Carme Camps
- Wellcome Centre for Human Genetics, University of Oxford, Old Road Campus, Roosevelt Drive, Oxford, OX3 7BN, UK
- NIHR Oxford Biomedical Research Centre, John Radcliffe Hospital, Oxford University Hospitals NHS Foundation Trust, Oxford, OX3 9DU, UK
| | - Edoardo Giacopuzzi
- Wellcome Centre for Human Genetics, University of Oxford, Old Road Campus, Roosevelt Drive, Oxford, OX3 7BN, UK
- NIHR Oxford Biomedical Research Centre, John Radcliffe Hospital, Oxford University Hospitals NHS Foundation Trust, Oxford, OX3 9DU, UK
- Human Technopole, Viale Rita Levi Montalcini 1, 20157, Milan, Italy
| | - John M Taylor
- NIHR Oxford Biomedical Research Centre, John Radcliffe Hospital, Oxford University Hospitals NHS Foundation Trust, Oxford, OX3 9DU, UK
- Oxford Genetics Laboratories, Oxford University Hospitals NHS Foundation Trust, Churchill Hospital, Old Road, Oxford, OX3 7LE, UK
| | - Mona Hashim
- Wellcome Centre for Human Genetics, University of Oxford, Old Road Campus, Roosevelt Drive, Oxford, OX3 7BN, UK
- NIHR Oxford Biomedical Research Centre, John Radcliffe Hospital, Oxford University Hospitals NHS Foundation Trust, Oxford, OX3 9DU, UK
| | - Eduardo Calpena
- NIHR Oxford Biomedical Research Centre, John Radcliffe Hospital, Oxford University Hospitals NHS Foundation Trust, Oxford, OX3 9DU, UK
- MRC Weatherall Institute of Molecular Medicine, University of Oxford, John Radcliffe Hospital, Oxford, OX3 9DS, UK
| | - Pamela J Kaisaki
- Wellcome Centre for Human Genetics, University of Oxford, Old Road Campus, Roosevelt Drive, Oxford, OX3 7BN, UK
- NIHR Oxford Biomedical Research Centre, John Radcliffe Hospital, Oxford University Hospitals NHS Foundation Trust, Oxford, OX3 9DU, UK
| | - Akiko Hashimoto
- MRC Weatherall Institute of Molecular Medicine, University of Oxford, John Radcliffe Hospital, Oxford, OX3 9DS, UK
| | - Jing Yu
- Wellcome Centre for Human Genetics, University of Oxford, Old Road Campus, Roosevelt Drive, Oxford, OX3 7BN, UK
- NIHR Oxford Biomedical Research Centre, John Radcliffe Hospital, Oxford University Hospitals NHS Foundation Trust, Oxford, OX3 9DU, UK
| | - Edward Sanders
- MRC Weatherall Institute of Molecular Medicine, University of Oxford, John Radcliffe Hospital, Oxford, OX3 9DS, UK
| | - Ron Schwessinger
- MRC Weatherall Institute of Molecular Medicine, University of Oxford, John Radcliffe Hospital, Oxford, OX3 9DS, UK
| | - Jim R Hughes
- MRC Weatherall Institute of Molecular Medicine, University of Oxford, John Radcliffe Hospital, Oxford, OX3 9DS, UK
| | - Gerton Lunter
- MRC Weatherall Institute of Molecular Medicine, University of Oxford, John Radcliffe Hospital, Oxford, OX3 9DS, UK
- University Medical Center Groningen, Groningen University, PO Box 72, 9700 AB, Groningen, The Netherlands
| | - Helene Dreau
- NIHR Oxford Biomedical Research Centre, John Radcliffe Hospital, Oxford University Hospitals NHS Foundation Trust, Oxford, OX3 9DU, UK
- Department of Oncology, Oxford Molecular Diagnostics Centre, University of Oxford, Level 4, John Radcliffe Hospital, Headley Way, Oxford, OX3 9DU, UK
| | - Matteo Ferla
- Wellcome Centre for Human Genetics, University of Oxford, Old Road Campus, Roosevelt Drive, Oxford, OX3 7BN, UK
- NIHR Oxford Biomedical Research Centre, John Radcliffe Hospital, Oxford University Hospitals NHS Foundation Trust, Oxford, OX3 9DU, UK
| | - Lukas Lange
- Wellcome Centre for Human Genetics, University of Oxford, Old Road Campus, Roosevelt Drive, Oxford, OX3 7BN, UK
- NIHR Oxford Biomedical Research Centre, John Radcliffe Hospital, Oxford University Hospitals NHS Foundation Trust, Oxford, OX3 9DU, UK
| | - Yesim Kesim
- Wellcome Centre for Human Genetics, University of Oxford, Old Road Campus, Roosevelt Drive, Oxford, OX3 7BN, UK
- NIHR Oxford Biomedical Research Centre, John Radcliffe Hospital, Oxford University Hospitals NHS Foundation Trust, Oxford, OX3 9DU, UK
| | - Vassilis Ragoussis
- Wellcome Centre for Human Genetics, University of Oxford, Old Road Campus, Roosevelt Drive, Oxford, OX3 7BN, UK
- NIHR Oxford Biomedical Research Centre, John Radcliffe Hospital, Oxford University Hospitals NHS Foundation Trust, Oxford, OX3 9DU, UK
| | - Dimitrios V Vavoulis
- Wellcome Centre for Human Genetics, University of Oxford, Old Road Campus, Roosevelt Drive, Oxford, OX3 7BN, UK
- NIHR Oxford Biomedical Research Centre, John Radcliffe Hospital, Oxford University Hospitals NHS Foundation Trust, Oxford, OX3 9DU, UK
- Department of Oncology, Oxford Molecular Diagnostics Centre, University of Oxford, Level 4, John Radcliffe Hospital, Headley Way, Oxford, OX3 9DU, UK
| | - Holger Allroggen
- Neurosciences Department, UHCW NHS Trust, Clifford Bridge Road, Coventry, CV2 2DX, UK
| | - Olaf Ansorge
- Nuffield Department of Clinical Neurosciences, University of Oxford, Oxford, OX3 9DU, UK
| | - Christian Babbs
- MRC Weatherall Institute of Molecular Medicine, University of Oxford, John Radcliffe Hospital, Oxford, OX3 9DS, UK
| | - Siddharth Banka
- Division of Evolution, Infection and Genomics, School of Biological Sciences, Faculty of Biology, Medicine and Health, University of Manchester, Manchester, UK
- Manchester Centre for Genomic Medicine, Saint Mary's Hospital, Oxford Road, Manchester, M13 9WL, UK
| | - Benito Baños-Piñero
- Oxford Genetics Laboratories, Oxford University Hospitals NHS Foundation Trust, Churchill Hospital, Old Road, Oxford, OX3 7LE, UK
| | - David Beeson
- MRC Weatherall Institute of Molecular Medicine, University of Oxford, John Radcliffe Hospital, Oxford, OX3 9DS, UK
- Nuffield Department of Clinical Neurosciences, University of Oxford, Oxford, OX3 9DU, UK
| | - Tal Ben-Ami
- Pediatric Hematology-Oncology Unit, Kaplan Medical Center, Rehovot, Israel
| | - David L Bennett
- Nuffield Department of Clinical Neurosciences, University of Oxford, Oxford, OX3 9DU, UK
| | - Celeste Bento
- Hematology Department, Hospitais da Universidade de Coimbra, Coimbra, Portugal
| | - Edward Blair
- NIHR Oxford Biomedical Research Centre, John Radcliffe Hospital, Oxford University Hospitals NHS Foundation Trust, Oxford, OX3 9DU, UK
- Oxford Centre for Genomic Medicine, Oxford University Hospitals NHS Foundation Trust, Oxford, OX3 7LE, UK
| | - Charlotte Brasch-Andersen
- Department of Clinical Genetics, Odense University Hospital and Department of Clinical Research, University of Southern Denmark, Odense, Denmark
| | - Katherine R Bull
- Wellcome Centre for Human Genetics, University of Oxford, Old Road Campus, Roosevelt Drive, Oxford, OX3 7BN, UK
- Nuffield Department of Medicine, University of Oxford, Oxford, OX3 7BN, UK
| | - Holger Cario
- Department of Pediatrics and Adolescent Medicine, University Medical Center, Eythstrasse 24, 89075, Ulm, Germany
| | - Deirdre Cilliers
- Oxford Centre for Genomic Medicine, Oxford University Hospitals NHS Foundation Trust, Oxford, OX3 7LE, UK
| | - Valerio Conti
- Neuroscience Department, Meyer Children's Hospital IRCCS, Viale Pieraccini 24, 50139, Florence, Italy
| | - E Graham Davies
- Department of Immunology, Great Ormond Street Hospital for Children NHS Trust and UCL Great Ormond Street Institute of Child Health, Zayed Centre for Research, 2Nd Floor, 20C Guilford Street, London, WC1N 1DZ, UK
| | - Fatima Dhalla
- Department of Paediatrics, Institute of Developmental and Regenerative Medicine, IMS-Tetsuya Nakamura Building, Old Road Campus, Roosevelt Drive, Oxford, OX3 7TY, UK
| | - Beatriz Diez Dacal
- Oxford Genetics Laboratories, Oxford University Hospitals NHS Foundation Trust, Churchill Hospital, Old Road, Oxford, OX3 7LE, UK
| | - Yin Dong
- MRC Weatherall Institute of Molecular Medicine, University of Oxford, John Radcliffe Hospital, Oxford, OX3 9DS, UK
- Nuffield Department of Clinical Neurosciences, University of Oxford, Oxford, OX3 9DU, UK
| | - James E Dunford
- Oxford NIHR Musculoskeletal BRC and Nuffield Department of Orthopaedics, Rheumatology and Musculoskeletal Sciences, Nuffield Orthopaedic Centre, Old Road, Oxford, OX3 7HE, UK
| | - Renzo Guerrini
- Neuroscience Department, Meyer Children's Hospital IRCCS, Viale Pieraccini 24, 50139, Florence, Italy
| | - Adrian L Harris
- Department of Oncology, University of Oxford, Old Road Campus Research Building, Oxford, OX3 7DQ, UK
| | - Jane Hartley
- Liver Unit, Birmingham Women's & Children's Hospital and University of Birmingham, Steelhouse Lane, Birmingham, B4 6NH, UK
| | - Georg Hollander
- Department of Paediatrics, University of Oxford, Level 2, Children's Hospital, John Radcliffe Hospital, Oxford, OX3 9DU, UK
| | - Kassim Javaid
- Oxford NIHR Musculoskeletal BRC and Nuffield Department of Orthopaedics, Rheumatology and Musculoskeletal Sciences, Nuffield Orthopaedic Centre, Old Road, Oxford, OX3 7HE, UK
| | - Maureen Kane
- Department of Pharmaceutical Sciences, School of Pharmacy, University of Maryland, Pharmacy Hall North, Room 731, 20 N. Pine Street, Baltimore, MD, 21201, USA
| | - Deirdre Kelly
- Liver Unit, Birmingham Women's & Children's Hospital and University of Birmingham, Steelhouse Lane, Birmingham, B4 6NH, UK
| | - Dominic Kelly
- Children's Hospital, OUH NHS Foundation Trust, NIHR Oxford BRC, Headley Way, Oxford, OX3 9DU, UK
| | - Samantha J L Knight
- Wellcome Centre for Human Genetics, University of Oxford, Old Road Campus, Roosevelt Drive, Oxford, OX3 7BN, UK
- NIHR Oxford Biomedical Research Centre, John Radcliffe Hospital, Oxford University Hospitals NHS Foundation Trust, Oxford, OX3 9DU, UK
| | - Alexandra Y Kreins
- Department of Immunology, Great Ormond Street Hospital for Children NHS Trust and UCL Great Ormond Street Institute of Child Health, Zayed Centre for Research, 2Nd Floor, 20C Guilford Street, London, WC1N 1DZ, UK
| | - Erika M Kvikstad
- Wellcome Centre for Human Genetics, University of Oxford, Old Road Campus, Roosevelt Drive, Oxford, OX3 7BN, UK
- NIHR Oxford Biomedical Research Centre, John Radcliffe Hospital, Oxford University Hospitals NHS Foundation Trust, Oxford, OX3 9DU, UK
| | - Craig B Langman
- Feinberg School of Medicine, Northwestern University, 211 E Chicago Avenue, Chicago, IL, MS37, USA
| | - Tracy Lester
- Oxford Genetics Laboratories, Oxford University Hospitals NHS Foundation Trust, Churchill Hospital, Old Road, Oxford, OX3 7LE, UK
| | - Kate E Lines
- NIHR Oxford Biomedical Research Centre, John Radcliffe Hospital, Oxford University Hospitals NHS Foundation Trust, Oxford, OX3 9DU, UK
- University of Oxford, Academic Endocrine Unit, OCDEM, Churchill Hospital, Oxford, OX3 7LJ, UK
| | - Simon R Lord
- Early Phase Clinical Trials Unit, Department of Oncology, University of Oxford, Cancer and Haematology Centre, Level 2 Administration Area, Churchill Hospital, Oxford, OX3 7LJ, UK
| | - Xin Lu
- Nuffield Department of Clinical Medicine, Ludwig Institute for Cancer Research, University of Oxford, Old Road Campus Research Building, Oxford, OX3 7DQ, UK
| | - Sahar Mansour
- St George's University Hospitals NHS Foundation Trust, Blackshore Road, Tooting, London, SW17 0QT, UK
| | - Adnan Manzur
- MRC Centre for Neuromuscular Diseases, National Hospital for Neurology and Neurosurgery, Queen Square, London, WC1N 3BG, UK
| | - Reza Maroofian
- Department of Neuromuscular Diseases, UCL Queen Square Institute of Neurology and The National Hospital for Neurology and Neurosurgery, London, WC1N 3BG, UK
| | - Brian Marsden
- Nuffield Department of Medicine, Kennedy Institute, University of Oxford, Oxford, OX3 7BN, UK
| | - Joanne Mason
- Yourgene Health Headquarters, Skelton House, Lloyd Street North, Manchester Science Park, Manchester, M15 6SH, UK
| | - Simon J McGowan
- MRC Weatherall Institute of Molecular Medicine, University of Oxford, John Radcliffe Hospital, Oxford, OX3 9DS, UK
| | - Davide Mei
- Neuroscience Department, Meyer Children's Hospital IRCCS, Viale Pieraccini 24, 50139, Florence, Italy
| | - Hana Mlcochova
- MRC Weatherall Institute of Molecular Medicine, University of Oxford, John Radcliffe Hospital, Oxford, OX3 9DS, UK
| | - Yoshiko Murakami
- Research Institute for Microbial Diseases, Osaka University, 3-1 Yamadaoka, Suita, Osaka, 565-0871, Japan
| | - Andrea H Németh
- Nuffield Department of Clinical Neurosciences, University of Oxford, Oxford, OX3 9DU, UK
- Oxford Centre for Genomic Medicine, Oxford University Hospitals NHS Foundation Trust, Oxford, OX3 7LE, UK
| | - Steven Okoli
- Imperial College NHS Trust, Department of Haematology, Hammersmith Hospital, Du Cane Road, London, W12 0HS, UK
| | - Elizabeth Ormondroyd
- NIHR Oxford Biomedical Research Centre, John Radcliffe Hospital, Oxford University Hospitals NHS Foundation Trust, Oxford, OX3 9DU, UK
- University of Oxford, Level 6 West Wing, Oxford, OX3 9DU, JR, UK
| | - Lilian Bomme Ousager
- Department of Clinical Genetics, Odense University Hospital and Department of Clinical Research, University of Southern Denmark, Odense, Denmark
| | - Jacqueline Palace
- Nuffield Department of Clinical Neurosciences, University of Oxford, Oxford, OX3 9DU, UK
| | - Smita Y Patel
- Clinical Immunology, John Radcliffe Hospital, Level 4A, Oxford, OX3 9DU, UK
| | - Melissa M Pentony
- Wellcome Centre for Human Genetics, University of Oxford, Old Road Campus, Roosevelt Drive, Oxford, OX3 7BN, UK
- NIHR Oxford Biomedical Research Centre, John Radcliffe Hospital, Oxford University Hospitals NHS Foundation Trust, Oxford, OX3 9DU, UK
| | - Chris Pugh
- Nuffield Department of Medicine, University of Oxford, Oxford, OX3 7BN, UK
| | - Aboulfazl Rad
- Department of Otolaryngology-Head & Neck Surgery, Tübingen Hearing Research Centre, Eberhard Karls University, Elfriede-Aulhorn-Str. 5, 72076, Tübingen, Germany
| | - Archana Ramesh
- Wellcome Centre for Human Genetics, University of Oxford, Old Road Campus, Roosevelt Drive, Oxford, OX3 7BN, UK
- Nuffield Department of Clinical Neurosciences, University of Oxford, Oxford, OX3 9DU, UK
| | - Simone G Riva
- MRC Weatherall Institute of Molecular Medicine, University of Oxford, John Radcliffe Hospital, Oxford, OX3 9DS, UK
| | - Irene Roberts
- MRC Weatherall Institute of Molecular Medicine, University of Oxford, John Radcliffe Hospital, Oxford, OX3 9DS, UK
- Department of Paediatrics, University of Oxford, Level 2, Children's Hospital, John Radcliffe Hospital, Oxford, OX3 9DU, UK
| | - Noémi Roy
- Department of Haematology, Oxford University Hospitals NHS Foundation Trust, Level 4, Haematology, John Radcliffe Hospital, Oxford, OX3 9DU, UK
| | - Outi Salminen
- NIHR Oxford Biomedical Research Centre, John Radcliffe Hospital, Oxford University Hospitals NHS Foundation Trust, Oxford, OX3 9DU, UK
- Department of Oncology, Oxford Molecular Diagnostics Centre, University of Oxford, Level 4, John Radcliffe Hospital, Headley Way, Oxford, OX3 9DU, UK
| | - Kyleen D Schilling
- Ann & Robert H. Lurie Children's Hospital of Chicago, 225 E Chicago Avenue, Chicago, IL, 60611, USA
| | - Caroline Scott
- MRC Weatherall Institute of Molecular Medicine, University of Oxford, John Radcliffe Hospital, Oxford, OX3 9DS, UK
| | - Arjune Sen
- Nuffield Department of Clinical Neurosciences, University of Oxford, Oxford, OX3 9DU, UK
| | - Conrad Smith
- Oxford Genetics Laboratories, Oxford University Hospitals NHS Foundation Trust, Churchill Hospital, Old Road, Oxford, OX3 7LE, UK
| | - Mark Stevenson
- University of Oxford, Academic Endocrine Unit, OCDEM, Churchill Hospital, Oxford, OX3 7LJ, UK
| | - Rajesh V Thakker
- University of Oxford, Academic Endocrine Unit, OCDEM, Churchill Hospital, Oxford, OX3 7LJ, UK
| | - Stephen R F Twigg
- MRC Weatherall Institute of Molecular Medicine, University of Oxford, John Radcliffe Hospital, Oxford, OX3 9DS, UK
| | - Holm H Uhlig
- NIHR Oxford Biomedical Research Centre, John Radcliffe Hospital, Oxford University Hospitals NHS Foundation Trust, Oxford, OX3 9DU, UK
- Department of Paediatrics, University of Oxford, Level 2, Children's Hospital, John Radcliffe Hospital, Oxford, OX3 9DU, UK
- Translational Gastroenterology Unit, John Radcliffe Hospital, Oxford, OX3 9DU, UK
| | - Richard van Wijk
- UMC Utrecht, Heidelberglaan 100, 3584 CX, Utrecht, The Netherlands
| | - Barbara Vona
- Department of Otolaryngology-Head & Neck Surgery, Tübingen Hearing Research Centre, Eberhard Karls University, Elfriede-Aulhorn-Str. 5, 72076, Tübingen, Germany
- Institute of Human Genetics, University Medical Center Göttingen, Heinrich-Düker-Weg 12, 37073, Göttingen, Germany
- Institute for Auditory Neuroscience and InnerEarLab, University Medical Center Göttingen, Robert-Koch-Str. 40, 37075, Göttingen, Germany
| | - Steven Wall
- Oxford Craniofacial Unit, John Radcliffe Hospital, Level LG1, West Wing, Oxford, OX3 9DU, UK
| | - Jing Wang
- Nuffield Department of Clinical Neurosciences, University of Oxford, Oxford, OX3 9DU, UK
| | - Hugh Watkins
- NIHR Oxford Biomedical Research Centre, John Radcliffe Hospital, Oxford University Hospitals NHS Foundation Trust, Oxford, OX3 9DU, UK
- University of Oxford, Level 6 West Wing, Oxford, OX3 9DU, JR, UK
| | - Jaroslav Zak
- Nuffield Department of Clinical Medicine, Ludwig Institute for Cancer Research, University of Oxford, Old Road Campus Research Building, Oxford, OX3 7DQ, UK
- Department of Immunology and Microbiology, The Scripps Research Institute, 10550 North Torrey Pines Road, La Jolla, CA, 92037, USA
| | - Anna H Schuh
- Department of Oncology, Oxford Molecular Diagnostics Centre, University of Oxford, Level 4, John Radcliffe Hospital, Headley Way, Oxford, OX3 9DU, UK
| | - Usha Kini
- NIHR Oxford Biomedical Research Centre, John Radcliffe Hospital, Oxford University Hospitals NHS Foundation Trust, Oxford, OX3 9DU, UK
- Oxford Centre for Genomic Medicine, Oxford University Hospitals NHS Foundation Trust, Oxford, OX3 7LE, UK
| | - Andrew O M Wilkie
- NIHR Oxford Biomedical Research Centre, John Radcliffe Hospital, Oxford University Hospitals NHS Foundation Trust, Oxford, OX3 9DU, UK
- MRC Weatherall Institute of Molecular Medicine, University of Oxford, John Radcliffe Hospital, Oxford, OX3 9DS, UK
| | - Niko Popitsch
- Wellcome Centre for Human Genetics, University of Oxford, Old Road Campus, Roosevelt Drive, Oxford, OX3 7BN, UK
- NIHR Oxford Biomedical Research Centre, John Radcliffe Hospital, Oxford University Hospitals NHS Foundation Trust, Oxford, OX3 9DU, UK
- Department of Biochemistry and Cell Biology, Max Perutz Labs, University of Vienna, Vienna BioCenter (VBC), Dr.-Bohr-Gasse 9, 1030, Vienna, Austria
| | - Jenny C Taylor
- Wellcome Centre for Human Genetics, University of Oxford, Old Road Campus, Roosevelt Drive, Oxford, OX3 7BN, UK.
- NIHR Oxford Biomedical Research Centre, John Radcliffe Hospital, Oxford University Hospitals NHS Foundation Trust, Oxford, OX3 9DU, UK.
|
12
|
Shen F, Hu C, Huang X, He H, Yang D, Zhao J, Yang X. Advances in alternative splicing identification: deep learning and pantranscriptome. Frontiers in Plant Science 2023; 14:1232466. [PMID: 37790793 PMCID: PMC10544900 DOI: 10.3389/fpls.2023.1232466] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/31/2023] [Accepted: 08/28/2023] [Indexed: 10/05/2023]
Abstract
In plants, alternative splicing is a crucial mechanism for regulating gene expression at the post-transcriptional level: it generates multiple mature mRNA isoforms, which leads to diverse proteins and diversifies gene regulation. Due to the complexity and variability of this process, accurate identification of splicing events is a vital step in studying alternative splicing. This article presents the application of alternative splicing algorithms with or without reference genomes in plants, as well as the integration of advanced deep learning techniques for improved detection accuracy. We also discuss alternative splicing studies in the pan-genomic background and the usefulness of integrated strategies for fully profiling alternative splicing.
Affiliation(s)
- Fei Shen
- Institute of Biotechnology, Beijing Academy of Agriculture and Forestry Sciences, Beijing, China
| | - Chenyang Hu
- Institute of Biotechnology, Beijing Academy of Agriculture and Forestry Sciences, Beijing, China
- Shanxi Key Lab of Chinese Jujube, College of Life Science, Yan’an University, Yan’an, Shanxi, China
| | - Xin Huang
- Institute of Biotechnology, Beijing Academy of Agriculture and Forestry Sciences, Beijing, China
| | - Hao He
- Institute of Biotechnology, Beijing Academy of Agriculture and Forestry Sciences, Beijing, China
| | - Deng Yang
- Institute of Biotechnology, Beijing Academy of Agriculture and Forestry Sciences, Beijing, China
| | - Jirong Zhao
- Shanxi Key Lab of Chinese Jujube, College of Life Science, Yan’an University, Yan’an, Shanxi, China
| | - Xiaozeng Yang
- Institute of Biotechnology, Beijing Academy of Agriculture and Forestry Sciences, Beijing, China
|
13
|
Joudaki A, Takeda JI, Masuda A, Ode R, Fujiwara K, Ohno K. FexSplice: A LightGBM-Based Model for Predicting the Splicing Effect of a Single Nucleotide Variant Affecting the First Nucleotide G of an Exon. Genes (Basel) 2023; 14:1765. [PMID: 37761905 PMCID: PMC10531444 DOI: 10.3390/genes14091765] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/02/2023] [Revised: 08/30/2023] [Accepted: 09/04/2023] [Indexed: 09/29/2023]
Abstract
Single nucleotide variants (SNVs) affecting the first nucleotide G of an exon (Fex-SNVs) identified in various diseases are mostly recognized as missense or nonsense variants. Their effect on pre-mRNA splicing has seldom been analyzed, and no curated database is available. We previously reported that Fex-SNVs affect splicing when the polypyrimidine tract is short or degenerate. However, we cannot readily predict the splicing effects of Fex-SNVs. Here, we scrutinized the available literature and identified 106 splicing-affecting Fex-SNVs based on experimental evidence. We similarly identified 106 neutral Fex-SNVs in the dbSNP database with a global minor allele frequency (MAF) of more than 0.01 and less than 0.50. We extracted 115 features representing the strength of splicing cis-elements and developed machine-learning models with support vector machine, random forest, and gradient boosting to discriminate splicing-affecting and neutral Fex-SNVs. Gradient boosting-based LightGBM outperformed the other two models, and the length and nucleotide composition of the polypyrimidine tract played critical roles in the discrimination. Recursive feature elimination showed that the LightGBM model using 15 features achieved the best performance in 10-fold cross-validation, with an accuracy of 0.80 ± 0.12 (mean and SD), a Matthews correlation coefficient (MCC) of 0.57 ± 0.15, an area under the receiver operating characteristic curve (AUROC) of 0.86 ± 0.08, and an area under the precision-recall curve (AUPRC) of 0.87 ± 0.09. We developed a web service program, named FexSplice, that accepts a genomic coordinate on either GRCh37/hg19 or GRCh38/hg38 and returns a predicted probability of aberrant splicing for the A, C, and T variants.
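The accuracy and MCC figures quoted in this abstract are both derived from a confusion matrix. As a rough illustration only, the sketch below computes the two metrics for a balanced set of 106 + 106 variants; the confusion counts are hypothetical values chosen for the example and are not results from the paper, and the function names are likewise illustrative.

```python
import math


def accuracy(tp, tn, fp, fn):
    """Fraction of all variants that were classified correctly."""
    return (tp + tn) / (tp + tn + fp + fn)


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient from confusion-matrix counts."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        # Conventional fallback when any marginal total is zero.
        return 0.0
    return (tp * tn - fp * fn) / denom


# Hypothetical counts for 106 splicing-affecting and 106 neutral variants.
tp, fn = 85, 21  # splicing-affecting variants: detected / missed
tn, fp = 85, 21  # neutral variants: correctly rejected / false alarms

print(f"accuracy = {accuracy(tp, tn, fp, fn):.2f}")
print(f"MCC      = {mcc(tp, tn, fp, fn):.2f}")
```

On a balanced set such as this one, MCC is lower than accuracy because it is zero, not 0.5, for a classifier that performs at chance.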
Affiliation(s)
- Atefeh Joudaki
- Division of Neurogenetics, Center for Neurological Diseases and Cancer, Nagoya University Graduate School of Medicine, 65 Tsurumai, Showa-ku, Nagoya 466-8550, Japan; (A.J.); (J.-i.T.); (A.M.)
| | - Jun-ichi Takeda
- Division of Neurogenetics, Center for Neurological Diseases and Cancer, Nagoya University Graduate School of Medicine, 65 Tsurumai, Showa-ku, Nagoya 466-8550, Japan; (A.J.); (J.-i.T.); (A.M.)
| | - Akio Masuda
- Division of Neurogenetics, Center for Neurological Diseases and Cancer, Nagoya University Graduate School of Medicine, 65 Tsurumai, Showa-ku, Nagoya 466-8550, Japan; (A.J.); (J.-i.T.); (A.M.)
| | - Rikumo Ode
- Department of Materials Science and Engineering, Nagoya University Graduate School of Engineering, Furo-cho, Chikusa-ku, Nagoya 464-8601, Japan; (R.O.); (K.F.)
| | - Koichi Fujiwara
- Department of Materials Science and Engineering, Nagoya University Graduate School of Engineering, Furo-cho, Chikusa-ku, Nagoya 464-8601, Japan; (R.O.); (K.F.)
| | - Kinji Ohno
- Division of Neurogenetics, Center for Neurological Diseases and Cancer, Nagoya University Graduate School of Medicine, 65 Tsurumai, Showa-ku, Nagoya 466-8550, Japan; (A.J.); (J.-i.T.); (A.M.)
|
14
|
Liu X, Shi X, Xin Q, Liu Z, Pan F, Qiao D, Chen M, Zhang Y, Guo W, Li C, Zhang Y, Shao L, Zhang R. Identified eleven exon variants in PKD1 and PKD2 genes that altered RNA splicing by minigene assay. BMC Genomics 2023; 24:407. [PMID: 37468838 PMCID: PMC10354997 DOI: 10.1186/s12864-023-09444-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/08/2023] [Accepted: 06/11/2023] [Indexed: 07/21/2023]
Abstract
BACKGROUND Autosomal dominant polycystic kidney disease (ADPKD) is a common monogenic multisystem disease caused primarily by mutations in the PKD1 or PKD2 gene. There is increasing evidence that some of these variants, which are described as missense, synonymous or nonsense mutations in the literature or databases, may be deleterious by affecting the pre-mRNA splicing process. RESULTS This study aimed to determine the effect of these PKD1 and PKD2 variants on exon splicing by combining predictive bioinformatics tools with minigene assays. Among the 19 candidate single nucleotide alterations, 11 variants distributed in PKD1 (c.7866C>A, c.7960A>G, c.7979A>T, c.7987C>T, c.11248C>G, c.11251C>T, c.11257C>G, c.11257C>T, c.11346C>T, and c.11393C>G) and PKD2 (c.1480G>T) were found to result in exon skipping. CONCLUSIONS We confirmed that 11 variants in the PKD1 and PKD2 genes affect normal splicing by interfering with the recognition of classical splice sites or by disrupting exonic splicing enhancers and generating exonic splicing silencers. Given the increasing number of variants identified in PKD1 and PKD2 in recent years, this is the most comprehensive study to date on pre-mRNA splicing of exonic variants in ADPKD-associated disease-causing genes. These results emphasize the significance of assessing the effect of exonic single nucleotide variants in ADPKD at the mRNA level.
Affiliation(s)
- Xuyan Liu
- Department of Nephrology, the Affiliated Qingdao Municipal Hospital of Qingdao University, No.5 Donghai Middle Road, Qingdao, 266071, China
| | - Xiaomeng Shi
- Department of Nephrology, the Affiliated Qingdao Municipal Hospital of Qingdao University, No.5 Donghai Middle Road, Qingdao, 266071, China
| | - Qing Xin
- Department of Nephrology, the Affiliated Qingdao Municipal Hospital of Qingdao University, No.5 Donghai Middle Road, Qingdao, 266071, China
| | - Zhiying Liu
- Renal Division, Peking University First Hospital, Beijing, China
| | - Fengjiao Pan
- Department of Nephrology, the Affiliated Qingdao Municipal Hospital of Qingdao University, No.5 Donghai Middle Road, Qingdao, 266071, China
| | - Dan Qiao
- Department of Nephrology, Dalian Medical University, Dalian, China
| | - Mengke Chen
- Department of Nephrology, Shandong First Medical University, Taian, China
| | - Yiyin Zhang
- Department of Nephrology, the Affiliated Qingdao Municipal Hospital of Qingdao University, No.5 Donghai Middle Road, Qingdao, 266071, China
| | - Wencong Guo
- Department of Nephrology, the Affiliated Qingdao Municipal Hospital of Qingdao University, No.5 Donghai Middle Road, Qingdao, 266071, China
| | - Changying Li
- Department of Nephrology, the Affiliated Qingdao Municipal Hospital of Qingdao University, No.5 Donghai Middle Road, Qingdao, 266071, China
| | - Yan Zhang
- Department of Nephrology, Weifang Medical University, Weifang, China
| | - Leping Shao
- Department of Nephrology, the Affiliated Qingdao Municipal Hospital of Qingdao University, No.5 Donghai Middle Road, Qingdao, 266071, China.
| | - Ruixiao Zhang
- Department of Emergency, the Affiliated Qingdao Municipal Hospital of Qingdao University, No.5 Donghai Middle Road, Qingdao, 266071, China.
|
15
|
Smith C, Kitzman JO. Benchmarking splice variant prediction algorithms using massively parallel splicing assays. bioRxiv: the preprint server for biology 2023:2023.05.04.539398. [PMID: 37205456 PMCID: PMC10187268 DOI: 10.1101/2023.05.04.539398] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Indexed: 05/21/2023]
Abstract
BACKGROUND Variants that disrupt mRNA splicing account for a sizable fraction of the pathogenic burden in many genetic disorders, but identifying splice-disruptive variants (SDVs) beyond the essential splice site dinucleotides remains difficult. Computational predictors are often discordant, compounding the challenge of variant interpretation. Because they are primarily validated using clinical variant sets heavily biased to known canonical splice site mutations, it remains unclear how well their performance generalizes. RESULTS We benchmarked eight widely used splicing effect prediction algorithms, leveraging massively parallel splicing assays (MPSAs) as a source of experimentally determined ground truth. MPSAs simultaneously assay many variants to nominate candidate SDVs. We compared experimentally measured splicing outcomes with bioinformatic predictions for 3,616 variants in five genes. Algorithms' concordance with MPSA measurements, and with each other, was lower for exonic than intronic variants, underscoring the difficulty of identifying missense or synonymous SDVs. Deep learning-based predictors trained on gene model annotations achieved the best overall performance at distinguishing disruptive and neutral variants. Controlling for overall call rate genome-wide, SpliceAI and Pangolin also showed superior overall sensitivity for identifying SDVs. Finally, our results highlight two practical considerations when scoring variants genome-wide: finding an optimal score cutoff, and the substantial variability introduced by differences in gene model annotation. We suggest strategies for optimal splice effect prediction in the face of these issues. CONCLUSION SpliceAI and Pangolin showed the best overall performance among the predictors tested; however, improvements in splice effect prediction are still needed, especially within exons.
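The abstract names the choice of score cutoff as a practical issue but does not say how a cutoff should be chosen. One common generic approach, shown here purely as an assumed illustration and not as the authors' procedure, is to scan candidate cutoffs against assay-derived labels and keep the one that maximizes Youden's J (sensitivity + specificity - 1). The scores and labels below are made up for the example.

```python
def best_cutoff(scores, labels):
    """Return (cutoff, J) maximizing Youden's J = sensitivity + specificity - 1.

    scores: predictor scores, higher meaning more likely splice-disruptive.
    labels: 1 for variants called disruptive by the assay, 0 for neutral.
    A variant is predicted disruptive when its score is >= cutoff.
    """
    positives = sum(labels)
    negatives = len(labels) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("need at least one disruptive and one neutral variant")

    best = (None, -1.0)
    for cutoff in sorted(set(scores)):
        tp = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= cutoff)
        tn = sum(1 for s, y in zip(scores, labels) if y == 0 and s < cutoff)
        j = tp / positives + tn / negatives - 1
        if j > best[1]:
            best = (cutoff, j)
    return best


# Hypothetical predictor scores and assay labels for seven variants.
scores = [0.05, 0.10, 0.20, 0.35, 0.60, 0.80, 0.90]
labels = [0, 0, 0, 1, 0, 1, 1]

cutoff, j = best_cutoff(scores, labels)
print(f"cutoff = {cutoff}, Youden's J = {j:.2f}")
```

A cutoff tuned this way depends on the mix of variants in the benchmark, so it would need re-deriving for a different gene set or annotation.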
Affiliation(s)
- Cathy Smith
- Department of Human Genetics, University of Michigan Medical School, Ann Arbor, MI 48109, USA
- Department of Computational Medicine and Bioinformatics, University of Michigan Medical School, Ann Arbor, MI 48109, USA
| | - Jacob O. Kitzman
- Department of Human Genetics, University of Michigan Medical School, Ann Arbor, MI 48109, USA
- Department of Computational Medicine and Bioinformatics, University of Michigan Medical School, Ann Arbor, MI 48109, USA
|
16
|
A deep intronic TCTN2 variant activating a cryptic exon predicted by SpliceRover in a patient with Joubert syndrome. J Hum Genet 2023:10.1038/s10038-023-01143-3. [PMID: 36894704 DOI: 10.1038/s10038-023-01143-3] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 09/14/2022] [Revised: 01/26/2023] [Accepted: 02/27/2023] [Indexed: 03/11/2023]
Abstract
The recent introduction of genome sequencing in genetic analysis has led to the identification of pathogenic variants located in deep introns. Recently, several new tools have emerged to predict the impact of variants on splicing. Here, we present a Japanese boy with Joubert syndrome and biallelic TCTN2 variants. Exome sequencing identified only a heterozygous maternal nonsense TCTN2 variant (NM_024809.5:c.916C>T, p.(Gln306Ter)). Subsequent genome sequencing identified a deep intronic variant (c.1033+423G>A) inherited from his father. The machine learning algorithms SpliceAI, Squirls, and Pangolin were unable to predict alterations in splicing by the c.1033+423G>A variant. SpliceRover, a tool for splice site prediction using FASTA sequence, was able to detect a cryptic exon that was 85 bp away from the variant and within the inverted Alu sequence, while SpliceRover scores for these splice sites showed a slight increase (donor) or decrease (acceptor) between the reference and mutant sequences. RNA sequencing and RT-PCR using urinary cells confirmed inclusion of the cryptic exon. The patient showed major symptoms of TCTN2-related disorders such as developmental delay, dysmorphic facial features and polydactyly. He also showed uncommon features such as retinal dystrophy, exotropia, abnormal pattern of respiration, and periventricular heterotopia, confirming these as features of TCTN2-related disorders. Our study highlights the usefulness of genome sequencing and RNA sequencing using urinary cells for molecular diagnosis of genetic disorders and suggests that a database of cryptic splice sites predicted in introns by SpliceRover using the reference sequences can be helpful in extracting candidate variants from large numbers of intronic variants in genome sequencing.
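Both TCTN2 variants in this abstract are written in HGVS coding-DNA notation, and the intronic one carries an offset (+423) counted from the last base of the preceding exon. The sketch below splits such strings into their parts. It is a minimal illustration that handles simple substitutions only; the function name, the regular expression and the 100-nucleotide threshold used to flag a variant as deep intronic are assumptions made for this example, not definitions taken from the paper.

```python
import re

# Simple coding-DNA substitutions only, e.g. "c.916C>T" or "c.1033+423G>A".
# Not handled: UTR positions (c.-12, c.*5), indels, duplications.
_HGVS_SUB = re.compile(r"^c\.(\d+)([+-]\d+)?\s*([ACGT])\s*>\s*([ACGT])$")


def parse_substitution(hgvs, deep_threshold=100):
    """Split a c. substitution into position, intronic offset, ref and alt."""
    match = _HGVS_SUB.match(hgvs.strip())
    if match is None:
        raise ValueError(f"unsupported HGVS string: {hgvs!r}")
    position, offset, ref, alt = match.groups()
    offset = int(offset) if offset else 0  # 0 means the variant is exonic
    return {
        "position": int(position),
        "intron_offset": offset,
        "ref": ref,
        "alt": alt,
        # Assumed cut-off: definitions of "deep intronic" vary between studies.
        "deep_intronic": abs(offset) > deep_threshold,
    }


print(parse_substitution("c.916C>T"))
print(parse_substitution("c.1033+423G>A"))
```

Filtering genome-sequencing calls on the `deep_intronic` flag is one way to shortlist variants for splice-site prediction, though the threshold would need to match the definition used by a given laboratory.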
|
17
|
de Sainte Agathe JM, Filser M, Isidor B, Besnard T, Gueguen P, Perrin A, Van Goethem C, Verebi C, Masingue M, Rendu J, Cossée M, Bergougnoux A, Frobert L, Buratti J, Lejeune É, Le Guern É, Pasquier F, Clot F, Kalatzis V, Roux AF, Cogné B, Baux D. SpliceAI-visual: a free online tool to improve SpliceAI splicing variant interpretation. Hum Genomics 2023; 17:7. [PMID: 36765386 PMCID: PMC9912651 DOI: 10.1186/s40246-023-00451-1] [Citation(s) in RCA: 37] [Impact Index Per Article: 37.0] [Received: 12/01/2022] [Accepted: 01/18/2023] [Indexed: 02/12/2023]
Abstract
SpliceAI is an open-source deep learning splicing prediction algorithm that has demonstrated in the past few years its high ability to predict splicing defects caused by DNA variations. However, its outputs present several drawbacks: (1) although the numerical values are very convenient for batch filtering, their precise interpretation can be difficult, (2) the outputs are delta scores which can sometimes mask a severe consequence, and (3) complex delins are most often not handled. We present here SpliceAI-visual, a free online tool based on the SpliceAI algorithm, and show how it complements the traditional SpliceAI analysis. First, SpliceAI-visual manipulates raw scores and not delta scores, as the latter can be misleading in certain circumstances. Second, the outcome of SpliceAI-visual is user-friendly thanks to the graphical presentation. Third, SpliceAI-visual is currently one of the only SpliceAI-derived implementations able to annotate complex variants (e.g., complex delins). We report here the benefits of using SpliceAI-visual and demonstrate its relevance in the assessment/modulation of the PVS1 classification criteria. We also show how SpliceAI-visual can elucidate several complex splicing defects taken from the literature but also from unpublished cases. SpliceAI-visual is available as a Google Colab notebook and has also been fully integrated in a free online variant interpretation tool, MobiDetails ( https://mobidetails.iurc.montp.inserm.fr/MD ).
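The point that a delta score can mask a severe consequence can be made with simple arithmetic. The sketch below uses invented raw scores for a weak natural splice site and is not the SpliceAI-visual implementation; the function name and the relative-change measure are assumptions introduced for illustration.

```python
def summarize_site(ref_raw, alt_raw):
    """Compare a delta score with the relative change in the raw score.

    ref_raw, alt_raw: raw splice-site probabilities (0 to 1) predicted for
    the reference and the variant sequence at the same position.
    """
    delta = alt_raw - ref_raw
    # Relative change is undefined when the reference score is zero.
    relative = delta / ref_raw if ref_raw > 0 else None
    return delta, relative


# Hypothetical weak natural acceptor: raw score 0.30 on the reference
# sequence, falling to 0.02 on the variant sequence.
delta, relative = summarize_site(0.30, 0.02)
print(f"delta score     = {delta:+.2f}")
print(f"relative change = {relative:+.0%}")
```

In this invented case the delta is modest in absolute terms, yet the site has lost most of its predicted strength, which is the kind of situation where inspecting raw scores adds information.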
Affiliation(s)
- Jean-Madeleine de Sainte Agathe
- Département de Génétique Médicale, Groupe Hospitalier Universitaire de la Pitié Salpêtrière, AP-HP.Sorbonne Université, Laboratoire de Médecine Génomique Sorbonne Université, Paris, France.
- Laboratoire de Biologie Médicale Multi-Sites SeqOIA (laboratoire-seqoia.fr/), Paris, France.
| | - Mathilde Filser
- Département de Génétique Médicale, Groupe Hospitalier Universitaire de la Pitié Salpêtrière, AP-HP.Sorbonne Université, Laboratoire de Médecine Génomique Sorbonne Université, Paris, France
| | - Bertrand Isidor
- Nantes Université, CHU Nantes, Service de Génétique Médicale, 44000, Nantes, France
| | - Thomas Besnard
- Nantes Université, CHU Nantes, Service de Génétique Médicale, 44000, Nantes, France
| | - Paul Gueguen
- Laboratoire de Biologie Médicale Multi-Sites SeqOIA (laboratoire-seqoia.fr/), Paris, France
- Service de Génétique, Inserm U1253, CHRU de Tours, Tours, France
| | - Aurélien Perrin
- Laboratoire de Génétique Moléculaire, CHU de Montpellier, Université de Montpellier, Montpellier, France
| | - Charles Van Goethem
- Laboratoire de Génétique Moléculaire, CHU de Montpellier, Université de Montpellier, Montpellier, France
| | - Camille Verebi
- Service de Médecine Génomique, Maladies de Système et d'Organe, Fédération de Génétique et de Médecine Génomique, DMU BioPhyGen, APHP Centre-Université Paris Cité, Hôpital Cochin, Paris, France
| | - Marion Masingue
- Centre de référence des maladies neuromusculaires Nord/Est/Ile de France, Hôpital Pitié-Salpêtrière, APHP, Paris, France
| | - John Rendu
- Inserm, U1216, CHU Grenoble Alpes, Grenoble Institut Neurosciences, Université Grenoble Alpes, Grenoble, France
| | - Mireille Cossée
- Laboratoire de Génétique Moléculaire, CHU de Montpellier, Université de Montpellier, Montpellier, France
- PhyMedExp, INSERM, CNRS, Université de Montpellier, Montpellier, France
| | - Anne Bergougnoux
- Laboratoire de Génétique Moléculaire, CHU de Montpellier, Université de Montpellier, Montpellier, France
- PhyMedExp, INSERM, CNRS, Université de Montpellier, Montpellier, France
| | - Laurent Frobert
- Laboratoire de Biologie Médicale Multi-Sites SeqOIA (laboratoire-seqoia.fr/), Paris, France
| | - Julien Buratti
- Département de Génétique Médicale, Groupe Hospitalier Universitaire de la Pitié Salpêtrière, AP-HP.Sorbonne Université, Laboratoire de Médecine Génomique Sorbonne Université, Paris, France
| | - Élodie Lejeune
- Département de Génétique Médicale, Groupe Hospitalier Universitaire de la Pitié Salpêtrière, AP-HP.Sorbonne Université, Laboratoire de Médecine Génomique Sorbonne Université, Paris, France
| | - Éric Le Guern
- Département de Génétique Médicale, Groupe Hospitalier Universitaire de la Pitié Salpêtrière, AP-HP.Sorbonne Université, Laboratoire de Médecine Génomique Sorbonne Université, Paris, France
- Laboratoire de Biologie Médicale Multi-Sites SeqOIA (laboratoire-seqoia.fr/), Paris, France
| | - Florence Pasquier
- Centre mémoire, Inserm U1172 DistALZ, Licend, Univ Lille, CHU Lille, 59000, Lille, France
| | - Fabienne Clot
- Département de Génétique Médicale, Groupe Hospitalier Universitaire de la Pitié Salpêtrière, AP-HP.Sorbonne Université, Laboratoire de Médecine Génomique Sorbonne Université, Paris, France
| | | | - Anne-Françoise Roux
- Laboratoire de Génétique Moléculaire, CHU de Montpellier, Université de Montpellier, Montpellier, France
- INM, Univ Montpellier, INSERM, CHU Montpellier, Montpellier, France
| | - Benjamin Cogné
- Laboratoire de Biologie Médicale Multi-Sites SeqOIA (laboratoire-seqoia.fr/), Paris, France
- Nantes Université, CHU Nantes, Service de Génétique Médicale, 44000, Nantes, France
| | - David Baux
- Laboratoire de Génétique Moléculaire, CHU de Montpellier, Université de Montpellier, Montpellier, France
- INM, Univ Montpellier, INSERM, CHU Montpellier, Montpellier, France
|
18
|
Barbosa P, Savisaar R, Carmo-Fonseca M, Fonseca A. Computational prediction of human deep intronic variation. Gigascience 2022; 12:giad085. [PMID: 37878682 PMCID: PMC10599398 DOI: 10.1093/gigascience/giad085] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 02/21/2023] [Revised: 06/07/2023] [Accepted: 09/20/2023] [Indexed: 10/27/2023]
Abstract
BACKGROUND The adoption of whole-genome sequencing in genetic screens has facilitated the detection of genetic variation in the intronic regions of genes, far from annotated splice sites. However, selecting an appropriate computational tool to discriminate functionally relevant genetic variants from those with no effect is challenging, particularly for deep intronic regions where independent benchmarks are scarce. RESULTS In this study, we have provided an overview of the computational methods available and the extent to which they can be used to analyze deep intronic variation. We leveraged diverse datasets to extensively evaluate tool performance across different intronic regions, distinguishing between variants that are expected to disrupt splicing through different molecular mechanisms. Notably, we compared the performance of SpliceAI, a widely used sequence-based deep learning model, with that of more recent methods that extend its original implementation. We observed considerable differences in tool performance depending on the region considered, with variants generating cryptic splice sites being better predicted than those that potentially affect splicing regulatory elements. Finally, we devised a novel quantitative assessment of tool interpretability and found that tools providing mechanistic explanations of their predictions are often correct with respect to the ground-truth information, but the use of these tools results in decreased predictive power when compared to black box methods. CONCLUSIONS Our findings translate into practical recommendations for tool usage and provide a reference framework for applying prediction tools in deep intronic regions, enabling more informed decision-making by practitioners.
Affiliation(s)
- Pedro Barbosa
- LASIGE, Departamento de Informática, Faculdade de Ciências, Universidade de Lisboa, 1749-016, Lisboa, Portugal
- Instituto de Medicina Molecular João Lobo Antunes, Faculdade de Medicina, Universidade de Lisboa, 1649-028, Lisboa, Portugal
| | | | - Maria Carmo-Fonseca
- Instituto de Medicina Molecular João Lobo Antunes, Faculdade de Medicina, Universidade de Lisboa, 1649-028, Lisboa, Portugal
| | - Alcides Fonseca
- LASIGE, Departamento de Informática, Faculdade de Ciências, Universidade de Lisboa, 1749-016, Lisboa, Portugal
|